This document proposes a new framework for predicting heart disease using machine learning techniques. It first discusses classification techniques such as artificial neural networks (ANN) and Naive Bayes, then feature selection techniques such as principal component analysis and information gain that reduce the number of attributes before classification. The proposed framework takes a dataset, applies feature selection to reduce the attribute set, and then runs two classification algorithms (ANN and Naive Bayes) on the reduced dataset to identify the attributes most relevant to heart disease. The aim is to pinpoint key attributes and predict heart disease more efficiently.
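A minimal sketch of such a pipeline, with scikit-learn standing in for the unspecified implementations; the dataset and parameter choices here are illustrative, not taken from the document:

```python
# Sketch of the proposed framework: feature selection (PCA) followed by two
# classifiers (ANN and Naive Bayes) on the reduced attribute set.
# The dataset and hyperparameters are illustrative stand-ins.
from sklearn.datasets import load_breast_cancer  # stand-in clinical dataset
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

results = {}
for clf in (MLPClassifier(max_iter=1000, random_state=0), GaussianNB()):
    # Reduce attributes with PCA, then classify with each algorithm.
    model = make_pipeline(StandardScaler(), PCA(n_components=8), clf)
    model.fit(X_tr, y_tr)
    results[type(clf).__name__] = model.score(X_te, y_te)
print(results)
```

Comparing the two classifiers on the same reduced attribute set is the essence of the framework; a real study would additionally compare feature selectors (e.g. information gain vs. PCA).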
A Survey on Heart Disease Prediction Techniques (ijtsrd)
Heart disease has been the leading cause of a huge number of deaths worldwide over the last few decades and has evolved into the most life-threatening disease. The health-care industry is rich in information, so there is a need to discover the hidden patterns and trends within it. For this purpose, data mining techniques can be applied to extract knowledge from large datasets. In recent times, many researchers have applied machine learning techniques to predict heart-related diseases, as these techniques can predict disease effectively. Even though machine learning has proved effective in assisting decision makers, there is still scope for developing a more accurate and efficient system to diagnose and predict heart disease, thereby easing the work of doctors. This paper presents a survey of various techniques used for predicting heart disease and reviews their performance. G. Niranjana | Dr I. Elizabeth Shanthi, "A Survey on Heart Disease Prediction Techniques", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5, Issue-2, February 2021. URL: https://www.ijtsrd.com/papers/ijtsrd38349.pdf Paper URL: https://www.ijtsrd.com/computer-science/other/38349/a-survey-on-heart-disease-prediction-techniques/g-niranjana
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER... (cscpconf)
In the present study, the abilities of three data mining classification methods, namely artificial neural networks with a feed-forward back-propagation algorithm, the J48 decision tree method, and logistic regression analysis, are compared on a real medical dataset. The objective of the study is the prediction of malignancy in suspected thyroid tumour patients. The accuracy of the predictions (the minimum error rate), the time consumed in the modelling process, and the interpretability and simplicity of the results for clinical experts are the factors considered in choosing the best method.
Correlation of artificial neural network classification and nfrs attribute fi... (eSAT Journals)
Abstract
Roughly 5 to 15% of women of reproductive age face Polycystic Ovarian Syndrome (PCOS), a multifaceted, heterogeneous and complex disease. Polycystic ovaries, chronic anovulation and hyperandrogenism, characterized by insulin resistance, cause long-term consequences such as endometrial hyperplasia, type 2 diabetes mellitus and coronary disease; hypertension, abdominal obesity, dyslipidemia and hyperinsulinemia together constitute the metabolic syndrome (frequent metabolic traits). These conditions lead to the common condition of anovulatory infertility. Computer-based information along with advanced data mining techniques is used to obtain appropriate results. Classification is a classic data mining task with roots in machine learning; Naïve Bayes, artificial neural networks, decision trees and support vector machines are classification techniques in data mining. Feature selection methods involve generation of subsets, evaluation of each subset, criteria for stopping the search, and validation procedures. The characteristics of the search method used are important for the time efficiency of feature selection. PCA (Principal Component Analysis), information gain subset evaluation, fuzzy rough set evaluation and Correlation-based Feature Selection (CFS) are some of the feature selection techniques; greedy first search, ranker, etc. are search algorithms used in feature selection. In this paper, a new algorithm based on fuzzy neural subset evaluation and an artificial neural network is proposed, which avoids performing classification and feature selection as separate tasks. The algorithm combines neural fuzzy rough subset evaluation and an artificial neural network for better performance than doing the tasks separately.
Keywords: ANN, SVM, PCA, CFS
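Of the feature selection criteria listed above, information gain is the simplest to state: the reduction in class entropy obtained by splitting on an attribute. A pure-Python illustration on toy data (the values are invented, not from the paper):

```python
# Information gain of an attribute = H(class) - H(class | attribute).
# Toy illustrative data, not from the paper.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(attr_values, labels):
    n = len(labels)
    cond = 0.0  # conditional entropy H(class | attribute)
    for v in set(attr_values):
        subset = [l for a, l in zip(attr_values, labels) if a == v]
        cond += len(subset) / n * entropy(subset)
    return entropy(labels) - cond

labels = ["pos", "pos", "neg", "neg"]
# A perfectly predictive attribute has gain equal to the class entropy;
# an uninformative one has gain 0.
print(information_gain(["a", "a", "b", "b"], labels))  # 1.0
print(information_gain(["a", "b", "a", "b"], labels))  # 0.0
```

A filter-style feature selector ranks all attributes by this score and keeps the top k.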
Growth in medical informatics can be observed nowadays. Advancements in different medical fields uncover various critical diseases and provide guidelines for their cure. This has been possible only because of rich medical databases and the automation of the data analysis process. This analysis requires much learning and intelligence, and data mining techniques provide the basis for it. Various data mining techniques are available, such as decision tree induction, rule-based classification or mining, support vector machines, stochastic classification, logistic regression, Naïve Bayes, artificial neural networks with fuzzy logic, and genetic algorithms. This paper presents the basics of data mining and the effective techniques available in medical science, and reviews the efforts made on medical databases using data mining techniques for human disease diagnosis.
Framework for efficient transformation for complex medical data for improving... (IJECEIAES)
Technological advancements have already been widely adopted in the healthcare sector. This adoption facilitates the automatic generation of medical data that can be autonomously forwarded to a designated hub in the form of cloud storage units. However, such technologies also produce a massive volume of complex medical data that significantly burdens analytical operations and wastes storage. Therefore, the proposed system implements a novel transformation technique that uses a template-based structure over the cloud to generate structured data from highly unstructured data in a non-conventional manner. The contribution of the proposed methodology is that it offers faster processing and storage optimization. The study outcome also shows that the proposed scheme performs better than the existing data transformation scheme.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Comparative Study of Classification Method on Customer Candidate Data to Pred... (IJECEIAES)
Vehicle-leasing companies are engaged in the field of vehicle loans. Purchase on credit is a mainstay because it attracts potential customers and generates more profit. But if a mistake is made in approving a candidate customer, stalled credit payments can result. To minimize this risk, data mining techniques can be applied to predict the future behavior of customers. This study explores data mining techniques such as C4.5 and Naive Bayes for this purpose. The customer attributes used are salary, age, marital status, other installments and creditworthiness. The experiments are performed using the Weka software. On the accuracy criterion, the C4.5 algorithm outperforms Naive Bayes. The percentage-split experiment scenario yields a precision of 89.16% and an accuracy of 83.33%, whereas the cross-validation scenarios give higher accuracy values for all k-folds used. The C4.5 results also confirm that the most influential attribute in this research is salary.
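A rough scikit-learn analogue of this Weka experiment: an entropy-based decision tree (a stand-in for C4.5/J48, which scikit-learn does not implement exactly) compared against Naive Bayes under cross-validation. The loan-applicant data below is invented for illustration:

```python
# Entropy-criterion decision tree (C4.5-like stand-in) vs. Gaussian Naive
# Bayes on synthetic loan-applicant data. All values are invented.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 300
salary = rng.normal(50, 15, n)              # hypothetical attributes
age = rng.integers(21, 60, n)
other_installments = rng.integers(0, 3, n)
# Worthiness driven mostly by salary, echoing the paper's finding.
worthy = (salary + rng.normal(0, 5, n) > 48).astype(int)
X = np.column_stack([salary, age, other_installments])

scores = {}
for clf in (DecisionTreeClassifier(criterion="entropy", random_state=0),
            GaussianNB()):
    scores[type(clf).__name__] = cross_val_score(clf, X, worthy, cv=10).mean()
print(scores)
```

With the class determined mainly by one attribute, both models recover the salary threshold; on real data the gap between them is what the paper measures.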
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION (IJDKP)
Developing predictive modelling solutions for risk estimation is extremely challenging in health-care informatics. Risk estimation involves integrating heterogeneous clinical sources with different representations from different health-care providers, making the task increasingly complex. Such sources are typically voluminous and diverse, and change significantly over time. Therefore, distributed and parallel computing tools, collectively termed big data tools, are needed to synthesize the data and assist the physician in making the right clinical decisions. In this work we propose a multi-model predictive architecture, a novel approach that combines the predictive ability of multiple models for better prediction accuracy. We demonstrate the effectiveness and efficiency of the proposed work on data from the Framingham Heart Study. Results show that the proposed multi-model predictive architecture provides better accuracy than the best-model approach. By modelling the error of the predictive models, we are able to choose a subset of models that yields accurate results. More information was modelled into the system by multi-level mining, which resulted in enhanced predictive accuracy.
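One simple way to realize the "combine multiple models, weighting by their error" idea is an inverse-error weighted vote. This is a generic sketch, not the paper's exact architecture; the models and numbers are invented:

```python
# Weighted-vote ensemble: each model's probability estimate is weighted by
# the inverse of its validation error, so more accurate models dominate.
# Models and numbers are illustrative, not from the paper.
def ensemble_predict(model_probs, val_errors):
    """model_probs: per-model P(event); val_errors: validation error rates."""
    weights = [1.0 / e for e in val_errors]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, model_probs)) / total

# Three hypothetical risk models; the most accurate one dominates the vote.
probs = [0.80, 0.60, 0.20]
errors = [0.10, 0.20, 0.40]
risk = ensemble_predict(probs, errors)
print(round(risk, 3))  # 0.657
```

Dropping models whose validation error exceeds a threshold before voting gives the "choose a subset of models" step the abstract mentions.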
Classification of medical datasets using back propagation neural network powe... (IJECEIAES)
Classification is one of the most indispensable domains in data mining and machine learning. The classification process has a good reputation in computer-based disease diagnosis, where progress in smart computing technologies can be invested in diagnosing various diseases from data of real patients documented in databases. The paper introduces a methodology for diagnosing a set of diseases including two types of cancer (breast and lung), two diabetes datasets, and heart attack. A back-propagation neural network plays the role of the classifier. The performance of the neural net is enhanced by a genetic algorithm, which provides the classifier with the optimal features to raise the classification rate as high as possible. The system showed high efficiency in dealing with databases that differ from each other in size, number of features and nature of the data, as the results illustrate: the classification rate reached 100% on most datasets.
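A toy sketch of the genetic-algorithm feature selection this paper describes. Here each individual is a bit mask over features, and the fitness function is invented for illustration; the real system would instead score the back-propagation network's classification rate on the masked features:

```python
# Toy genetic algorithm for feature-subset selection. Each individual is a
# bit mask over features; fitness rewards masks matching a hidden "useful"
# set (a real system would use classifier accuracy as the fitness).
import random

random.seed(0)
N_FEATURES = 10
USEFUL = {0, 2, 5, 7}           # hypothetical informative features

def fitness(mask):
    chosen = {i for i, bit in enumerate(mask) if bit}
    # reward useful features, penalize irrelevant ones
    return len(chosen & USEFUL) - 0.5 * len(chosen - USEFUL)

def evolve(pop_size=30, generations=40):
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_FEATURES)   # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(N_FEATURES)        # point mutation
            child[i] ^= random.random() < 0.1
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print([i for i, bit in enumerate(best) if bit])
```

The selected indices would then be the only inputs fed to the neural-network classifier.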
PREDICTIVE ANALYTICS IN HEALTHCARE SYSTEM USING DATA MINING TECHNIQUES (cscpconf)
The health sector has witnessed great evolution following the development of new computer technologies, which pushed this area to produce more medical data and gave birth to multiple fields of research. Many efforts are made to cope with the explosion of medical data on the one hand, and to obtain useful knowledge from it on the other. This has prompted researchers to apply technical innovations such as big data analytics, predictive analytics and machine learning algorithms in order to extract useful knowledge and support decision making. With the promise of predictive analytics on big data and the use of machine learning algorithms, predicting the future is no longer so difficult, especially in medicine, where predicting diseases and anticipating cures has become possible. In this paper we present an overview of the evolution of big data in the healthcare system and apply a learning algorithm to a set of medical data. The objective is to predict chronic kidney disease using the Decision Tree (C4.5) algorithm.
TOP K-OPINION DECISIONS RETRIEVAL IN HEALTHCARE SYSTEM (csandit)
The aim of this paper is to apply data mining techniques and opinion mining (OM) concepts to the field of health informatics. Decision making in health informatics involves a number of opinions given by a group of medical experts for a specific disease, in the form of decision-based opinions presented as text in a medical database. These decision-based opinions are then mined from the database with the help of mining techniques. Text document clustering plays a major role in the fast-developing information explosion and is considered a tool for performing information-based operations. Text document clustering generates clusters from the whole document collection automatically; normally the K-means clustering technique is used for text document clustering. In this paper we use the bisecting K-means clustering technique, which performs better than the traditional K-means technique. The objective is to study the revealed groupings of similar opinion types associated with the likelihood assessments of physicians and medical experts.
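The bisecting K-means approach — repeatedly split the largest cluster with plain 2-means until k clusters remain — can be sketched in a few lines. Toy 1-D vectors stand in here for real text-document vectors (e.g. TF-IDF):

```python
# Bisecting K-means: start with one cluster, repeatedly bisect the largest
# cluster using plain 2-means until k clusters exist. Toy 1-D "document
# vectors" are used instead of real TF-IDF vectors.
import random

def kmeans2(points, iters=20, seed=0):
    """Plain Lloyd 2-means on a list of 1-D points; returns two sub-lists."""
    rng = random.Random(seed)
    c0, c1 = rng.sample(points, 2)
    for _ in range(iters):
        a = [p for p in points if abs(p - c0) <= abs(p - c1)]
        b = [p for p in points if abs(p - c0) > abs(p - c1)]
        if a: c0 = sum(a) / len(a)
        if b: c1 = sum(b) / len(b)
    return a, b

def bisecting_kmeans(points, k):
    clusters = [list(points)]
    while len(clusters) < k:
        clusters.sort(key=len, reverse=True)   # pick the largest cluster
        a, b = kmeans2(clusters.pop(0))        # ... and bisect it
        clusters += [a, b]
    return clusters

data = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.0, 9.2, 8.9]
clusters = bisecting_kmeans(data, 3)
print(sorted(sorted(c) for c in clusters))
```

Because each step is only a 2-means run, bisecting K-means tends to produce more balanced clusters than a single k-way K-means on text collections, which is the advantage the paper relies on.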
The objective of this study was to apply data mining to the analysis of imports of pharmaceutical products in Kenya, with the aim of discovering patterns of association and correlation among the various product groups. RapidMiner was used to analyze data obtained from the Pharmacy and Poisons Board, the regulator of pharmaceutical products in the country, covering the period 2008 to 2010. The CRISP method was used to gain a business understanding of the Board, understand the nature of the data held, prepare the data for analysis, and perform the actual analysis. The study uncovered various patterns through correlation and association analysis of the product groups. The results were presented through various graphs and discussed with domain experts. These patterns are similar to prescription patterns from studies in Ethiopia, Nigeria and India. The research will give regulators of pharmaceutical products, not only in Kenya but in other African countries, better insight into the patterns of imports and exports of pharmaceutical products. This would result in better controls, not only of imports and exports of the products, but also enforcement of their usage, in order to avert negative effects on citizens.
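The association analysis performed in RapidMiner reduces to support and confidence measures over transactions. A minimal pure-Python version, with invented product groups standing in for the study's import data:

```python
# Minimal association-rule measures over import "transactions".
# Product groups and transactions are invented for illustration.
transactions = [
    {"antibiotics", "analgesics"},
    {"antibiotics", "analgesics", "antimalarials"},
    {"antibiotics", "antimalarials"},
    {"analgesics"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """P(consequent | antecedent) estimated from the transactions."""
    return support(antecedent | consequent) / support(antecedent)

rule = (frozenset({"antibiotics"}), frozenset({"antimalarials"}))
print("support:", support(rule[0] | rule[1]))   # 0.5
print("confidence:", confidence(*rule))
```

Rules whose support and confidence exceed chosen thresholds are the "patterns of association" such a study reports.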
A comprehensive study on disease risk predictions in machine learning (IJECEIAES)
Over recent years, multiple disease risk prediction models have been developed. These models use various patient characteristics to estimate the probability of outcomes over a certain period of time, and hold the potential to improve decision making and individualize care. Discovering hidden patterns and interactions in medical databases, together with growing evaluation of disease prediction models, has become crucial; traditional clinical findings require many trials, which can complicate disease prediction. A comprehensive study of the different strategies used to predict disease is presented in this paper. Applying these techniques to healthcare data has improved risk prediction models that identify the patients who would benefit from disease management programs, reducing hospital readmission and healthcare cost, though the results of these endeavors have been mixed.
A Broadcasting Scheme for Message Dissemination in VANET (IJERA Editor)
Vehicular Ad hoc Networks (VANETs) are one of the fastest-emerging technologies for research, as there are many issues and challenges to be addressed by researchers before the technology becomes commercialized. Vehicular communication systems developed largely from the growing interest in intelligent transportation systems (ITS). Cooperative driving can improve safety and efficiency by enabling neighboring vehicles to exchange emergency messages with each other and by assisting the driver in making proper decisions to avoid vehicle collisions and congestion. Broadcast transmission is usually used for disseminating safety-related information among vehicles. Message broadcast over wireless networks poses many challenges due to link unreliability, hidden terminals, message redundancy, broadcast storms, etc., which greatly degrade network performance. In most emergency situations, there is little time to perform a handshake with other nodes in the network, as the emergency message must be delivered quickly and efficiently. Broadcasting information is usually very costly, and without limiting techniques this results in serious data redundancy, contention and collisions. This work focuses on a broadcasting scheme for message dissemination.
Mathematical Modeling of Cylindrical Dielectric Resonator Antenna for Applica...IJERA Editor
We are moving forward in an era where adaptive antenna arrays will be capable of identifying the direction of the incoming signal and steering the transmitted beam in appropriate directions. It has already been proposed that Dielectric Resonator Antennas (DRAs) can be good candidates for such applications. In this paper, we have carefully analyzed the theoretical model of a DRA and have proposed various mathematical methods for its analysis. The methods proposed herein can reduce the complexity of analysis and design of circuits involving DRAs.
Investigation on SS316, SS440C, and Titanium Alloy Grade-5 used as Single Poi...IJERA Editor
The main objective of this work is to find alternative materials for the cutting tools used in turning operations. The conventional materials like tungsten carbide(WC), titanium carbide(TiC), cubic boron nitride (CBN) and diamond used as cutting tools for turning operations on lathe are expensive. Titanium grade 5 (Ti-6Al-4V), SS440C/AISI440C and SS316 are some of the materials which satisfy the necessary requirements for turning metals and polymer materials. These materials are machined as per the standard tool signature of high-speed steel tool (HSS) and are subjected to necessary heat treatment for hardening and then finish ground. The machined tools thus prepared were used to turn mild steel and aluminium workpieces. The cutting forces at play are determined using lathe tool dynamometer and plotted on a MCD (Merchant’s Circle Diagram). The cutting tools are also subjected to tests to determine tool life, wear and work hardening. It is found that the performance and tool life of SS440C is better and cost effective compared to existing tools. Even though Ti-6Al-4V is comparatively costly it could be used for obtaining good surface finish.
Data Retrieval Scheduling For Unsynchronized Channel in Wireless Broadcast Sy...IJERA Editor
Wireless data broadcast disseminates data to a large number of mobile clients. In many information services, users query multiple data items at a time. The environment under consideration is asymmetric: the information server has far more bandwidth available than the clients. To maximize the number of downloads completed by a deadline, this work defines a problem called Largest Number Data Retrieval (LNDR), proves that its decision version is NP-hard, and investigates an approximation algorithm for it. It also defines Minimum Cost Data Retrieval (MCDR), which aims to download a set of requested data items with the least response time and energy consumption, and studies the data scheduling problem over unsynchronized channels at the server side. LNDR and MCDR are addressed in both push-based and pull-based broadcast systems, and the proposed approximation algorithms efficiently schedule the retrieval of multiple data items from multiple channels. When channel-switching time can be ignored, a polynomial-time optimal algorithm based on maximum matching is exhibited for LNDR; when switching time cannot be neglected, simulation results demonstrate the practical efficiency of the proposed algorithms.
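When channel-switching time is negligible, scheduling the largest number of downloads reduces to a maximum bipartite matching between time slots and requested items. A minimal augmenting-path sketch (the item availability schedule below is invented for illustration):

```python
# Maximum-matching sketch for LNDR when channel-switching time is
# negligible: each time slot can download at most one requested item,
# and each item is broadcast only in certain slots.

def max_downloads(availability):
    """availability[item] = set of time slots in which the item airs."""
    match = {}  # slot -> item currently assigned to it

    def augment(item, visited):
        for slot in availability[item]:
            if slot in visited:
                continue
            visited.add(slot)
            # Take the slot if it is free, or if its current item can
            # be re-routed to some other slot along an augmenting path.
            if slot not in match or augment(match[slot], visited):
                match[slot] = item
                return True
        return False

    return sum(augment(item, set()) for item in availability)

schedule = {"a": {0, 1}, "b": {1}, "c": {1, 2}, "d": {2}}
print(max_downloads(schedule))  # 3: one of c/d must be dropped
```

Each augmenting-path search runs in time linear in the schedule size, so the whole procedure is polynomial, consistent with the abstract's claim for the switching-free case.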
A Comprehensive Introduction of the Finite Element Method for Undergraduate C...IJERA Editor
A simple and comprehensive introduction of the Finite Element Method for undergraduate courses is proposed. With very simple mathematics, students can easily understand it. The primary objective is to make students comfortable with the approach and cognizant of its potentials. The technique is based on the general overview of the steps involved in the solution of a typical finite element problem. This is followed by simple examples of mathematical developments which allow developing and demonstrating the major aspects of the finite element approach unencumbered by complicating factors.
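The "simple examples" such an introduction typically starts from can be as small as an axial bar split into two linear elements. A sketch with invented material values (this is a generic textbook example, not one taken from the paper):

```python
# Minimal 1D finite-element example: an axial bar fixed at the left end
# with a point load P at the free end, discretized into two linear
# elements.  Material data (E, A, Le, P) are invented for illustration.
E, A, Le, P = 1.0, 1.0, 1.0, 1.0   # modulus, area, element length, load
k = E * A / Le                      # stiffness of one linear element

# Step 1: assemble the global stiffness matrix from element matrices.
K = [[0.0] * 3 for _ in range(3)]
for e in range(2):                  # element e connects nodes e and e+1
    K[e][e] += k;     K[e][e + 1] -= k
    K[e + 1][e] -= k; K[e + 1][e + 1] += k

# Step 2: apply the boundary condition u0 = 0 (drop row/column 0) and
# solve the remaining 2x2 system with Cramer's rule.
det = K[1][1] * K[2][2] - K[1][2] * K[2][1]
u1 = (0.0 * K[2][2] - K[1][2] * P) / det
u2 = (K[1][1] * P - 0.0 * K[2][1]) / det
print(u1, u2)   # matches the exact solution u(x) = P*x/(E*A)
```

The two steps (assembly, then constrained solve) are exactly the "general overview of the steps" the abstract describes, stripped of complicating factors.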
Comparison of the FE/FE and FV/FE treatment of fluid-structure interactionIJERA Editor
“3D Turtle Graphics” by using a 3D PrinterIJERA Editor
When creating shapes with a 3D printer, a static (declarative) model designed in a 3D CAD system is usually translated into a CAM program and sent to the printer. However, widely used FDM-type 3D printers accept a dynamic (procedural) program that controls the motion of the print head and the extrusion of the filament. If this program is expressed directly in a programming language or library, solids can be created by a method similar to turtle graphics. An open-source Python library that enables this "turtle 3D printing" method was implemented and tested. The method currently cannot print in the air; if this limitation is overcome by an appropriate technique, shapes drawn freely with 3D turtle graphics could be embodied by this method.
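The idea can be sketched in a few lines: keep a turtle state (position, heading) and emit one G-code move per `forward()`, feeding filament proportionally to the distance travelled. This is a toy illustration, not the library the abstract describes, and the extrusion factor is an invented simplification of real slicer math:

```python
import math

# Toy "turtle 3D printing" sketch: each forward() move becomes a G1
# command whose extrusion E grows with the distance travelled.

class PrintTurtle:
    def __init__(self, extrusion_per_mm=0.05):
        self.x = self.y = self.z = 0.0
        self.heading = 0.0            # degrees, in the XY plane
        self.e = 0.0                  # cumulative filament feed
        self.k = extrusion_per_mm     # invented extrusion factor
        self.gcode = ["G28", "G90"]   # home, absolute positioning

    def turn(self, degrees):
        self.heading = (self.heading + degrees) % 360

    def forward(self, dist):
        self.x += dist * math.cos(math.radians(self.heading))
        self.y += dist * math.sin(math.radians(self.heading))
        self.e += dist * self.k       # feed filament along the move
        self.gcode.append(f"G1 X{self.x:.2f} Y{self.y:.2f} E{self.e:.3f}")

t = PrintTurtle()
for _ in range(4):                    # draw one square perimeter
    t.forward(20)
    t.turn(90)
print("\n".join(t.gcode))
```

A real implementation would also manage Z hops between layers, temperatures, and feed rates; the point here is only the turtle-to-G-code mapping.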
Sentiment Analysis Using Hybrid Approach: A SurveyIJERA Editor
Sentiment analysis is the process of identifying people's attitudes and emotional states from language. Its main objective is realized by identifying a set of potential features in a review and extracting opinion expressions about those features by exploiting their associations. Opinion mining, another name for sentiment analysis, is the study of the sentiments and expressions stated in natural language; natural language processing techniques are applied to extract emotions from unstructured data. Several techniques can be used to analyse such data, categorized broadly here as supervised learning, unsupervised learning and hybrid techniques. The objective of this paper is to provide an overview of sentiment analysis, its challenges, and a comparative analysis of its techniques in the field of Natural Language Processing.
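A common supervised baseline in this taxonomy is multinomial Naive Bayes over a bag-of-words model. A toy sketch with an invented four-review corpus (not a method from the surveyed papers):

```python
from collections import Counter
import math

# Toy supervised sentiment classifier: multinomial Naive Bayes with
# Laplace smoothing over a bag-of-words model.  Corpus is invented.
train = [
    ("great phone love the screen", "pos"),
    ("battery life is great", "pos"),
    ("terrible battery awful screen", "neg"),
    ("awful phone do not buy", "neg"),
]

counts = {"pos": Counter(), "neg": Counter()}
docs = Counter()
for text, label in train:
    docs[label] += 1
    counts[label].update(text.split())

vocab = set(w for c in counts.values() for w in c)

def predict(text):
    best, best_lp = None, -math.inf
    for label in counts:
        lp = math.log(docs[label] / len(train))      # class prior
        total = sum(counts[label].values())
        for w in text.split():
            if w not in vocab:
                continue                              # ignore unseen words
            # Laplace-smoothed word likelihood P(w | label)
            lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
        best, best_lp = (label, lp) if lp > best_lp else (best, best_lp)
    return best

print(predict("love this great battery"))  # pos
print(predict("awful terrible screen"))    # neg
```

Unsupervised approaches replace the labelled corpus with a sentiment lexicon, and hybrid techniques combine the two; the scoring skeleton stays similar.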
Evaluation on Low Temperature Performance of Recycled Asphalt Mixture Using W...IJERA Editor
This paper studies the recycling technology of asphalt mixture, with an in-depth study of the low-temperature performance of warm mix asphalt (WMA). It first evaluates the low-temperature performance of WMA made of reclaimed asphalt pavement (RAP), both material that passed and material that did not pass a 2.36 mm screen, and the influence of different dosages of dispersant on WMA containing RAP. Then, SBS-modified asphalt and base asphalt are tested at low temperature to study how the asphalt type influences the low-temperature performance of WMA.
IRDO: Iris Recognition by fusion of DTCWT and OLBPIJERA Editor
Iris biometrics is a physiological trait of human beings. In this paper, we propose an iris recognition method using the fusion of Dual Tree Complex Wavelet Transform (DTCWT) and Overlapping Local Binary Pattern (OLBP) features. An eye image is preprocessed to extract the iris and obtain the Region of Interest (ROI). Complex wavelet features are extracted from the iris region with the DTCWT, and OLBP is then applied on the ROI to generate magnitude-coefficient features. The resultant features are generated by fusing the DTCWT and OLBP features using arithmetic addition. The Euclidean Distance (ED) between the test iris and the database iris features is used to identify a person. The values of Total Success Rate (TSR) and Equal Error Rate (EER) are observed to be better for the proposed IRDO than for state-of-the-art techniques.
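The matching stage described above (fusion by arithmetic addition, then minimum Euclidean distance) can be sketched as follows. Random vectors stand in for the DTCWT magnitudes and OLBP histograms, which are not re-implemented here, and all enrolled subjects are invented:

```python
import math, random

# Sketch of the IRDO matching stage: two feature vectors per iris are
# fused by arithmetic addition, and a probe is identified by minimum
# Euclidean distance to the enrolled templates.
random.seed(7)

def fake_features(n=8):
    """Stand-in for a real DTCWT or OLBP feature extractor."""
    return [random.random() for _ in range(n)]

def fuse(dtcwt_vec, olbp_vec):
    return [a + b for a, b in zip(dtcwt_vec, olbp_vec)]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Enroll three subjects (features invented).
database = {pid: fuse(fake_features(), fake_features())
            for pid in ("person_a", "person_b", "person_c")}

# A genuine probe: person_b's template plus small sensor noise.
probe = [x + random.gauss(0, 0.01) for x in database["person_b"]]

best = min(database, key=lambda pid: euclidean(database[pid], probe))
print(best)  # person_b
```

TSR and EER are then measured by sweeping a distance threshold over many genuine and impostor comparisons of exactly this kind.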
A Multipath Connection Model for Traffic MatricesIJERA Editor
Peer-to-Peer (P2P) applications have grown increasingly popular in recent years, bringing new challenges to network management and traffic engineering (TE). As basic input information, P2P traffic matrices are of significant importance for TE, yet direct measurement is excessively costly. This paper presents a multipath connection model for P2P traffic matrices in operational networks, covering peer-to-peer sharing of media files and the localization ratio of P2P traffic. Its performance is evaluated using traffic traces collected from real P2P video-on-demand and file-sharing applications. The estimated general traffic matrices (TMs) are then used to transfer media files without generating excess source-to-destination traffic, giving high performance and short processing time.
Region based elimination of noise pixels towards optimized classifier models ...IJERA Editor
The extraction of skin pixels from a human image and the rejection of non-skin pixels is called skin segmentation. Skin pixel detection extracts the skin pixels in a human image and is typically used as a pre-processing step for extracting face regions. Several computer vision approaches and techniques have been developed for skin pixel detection in the past. In this process, pixels are transformed into an appropriate color space such as RGB or HSV, and a skin classifier model then labels each pixel as belonging to a skin or non-skin region. This research performs region-based elimination of noise pixels and a performance analysis of classifier models for skin pixel detection applied to human images. It involves representing the image in color models, eliminating non-skin pixels, pre-processing and cleansing the collected data, selecting features of the human image, and building the classifier model. Skin pixel classifier models are proposed and implemented with a comparative performance analysis; the feature vector is simply the selection of skin pixels from one or more human images. Performance is evaluated by comparing and analysing skin colour segmentation algorithms, and iterative efforts during implementation guide the selection of an optimized skin classifier based on machine learning algorithms and their performance.
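One well-known concrete instance of the "classifier model" step is the classic explicit-threshold rule in RGB space (the widely cited Kovac rule). It is shown here only as a baseline; the models in this research are learned, not hand-set:

```python
# Classic explicit-threshold skin classifier in RGB space (Kovac rule
# for uniform daylight illumination), as a baseline skin/non-skin
# labeller applied pixel by pixel.

def is_skin(r, g, b):
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)

def segment(pixels):
    """Label each (r, g, b) pixel; non-skin pixels are the noise to drop."""
    return [is_skin(*p) for p in pixels]

print(segment([(220, 170, 140),   # light skin tone      -> True
               (90, 60, 40),      # too dark for the rule -> False
               (30, 120, 30)]))   # green background      -> False
```

A learned classifier replaces `is_skin` with a model fitted to labelled pixels, but the surrounding segmentation loop is the same.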
Neural Network Algorithm for Radar Signal RecognitionIJERA Editor
Nowadays, traditional recognition methods cannot keep pace with the development of radar signals. In this paper, a new radar signal recognition algorithm based on fractal theory and neural networks is presented. The relevant points are extracted as the input of a neural network, which then recognizes and classifies the signals. Simulation results show that this algorithm classifies effectively even under low-SNR conditions.
A Study of Anodic Voltage Drop in Aluminum Reduction Cell by Finite Element A...IJERA Editor
Aluminum extraction is a very energy-intensive process, so reducing energy consumption plays one of the most important roles in aluminum reduction cell design. A good path toward this goal is voltage savings at the anode assembly. The aim of this work is to develop a 3D thermo-electrical finite element model and validate it against actual temperature measurements and electrical calculations for the anode assembly. The model is used to estimate the temperature distribution and the anodic voltage drop over the anode assembly, and to suggest alternative design modifications that reduce the anodic voltage drop. The effects of changing the stub diameter and the chemical composition of the cast iron on the anodic voltage drop were studied. The findings indicate that the stub diameter has a stronger effect than the cast iron composition.
Properties of Caputo Operator and Its Applications to Linear Fractional Diffe...IJERA Editor
The purpose of this paper is to demonstrate the power of the two most widely used definitions of fractional differentiation, the Riemann-Liouville and Caputo fractional operators, in solving some linear fractional-order differential equations. Emphasis is given to the popular Caputo fractional operator, which is more suitable for the study of differential equations of fractional order. Illustrative examples demonstrate the solution of a couple of fractional differential equations with the Caputo operator using the Laplace transformation, showing that the Laplace transform is a powerful and efficient technique for obtaining analytic solutions of linear fractional differential equations.
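The procedure the abstract refers to rests on one identity: the Laplace transform of the Caputo derivative involves only integer-order initial values. For the simplest relaxation equation it gives (standard result, for 0 < α ≤ 1):

```latex
% Laplace transform of the Caputo derivative (0 < \alpha \le 1):
\mathcal{L}\{{}^{C}\!D^{\alpha}y(t)\}(s) = s^{\alpha}Y(s) - s^{\alpha-1}y(0)

% Applied to the relaxation equation {}^{C}\!D^{\alpha}y(t) = -\lambda y(t):
s^{\alpha}Y(s) - s^{\alpha-1}y(0) = -\lambda Y(s)
\quad\Longrightarrow\quad
Y(s) = \frac{y(0)\, s^{\alpha-1}}{s^{\alpha} + \lambda}

% Inverting gives the Mittag-Leffler solution:
y(t) = y(0)\, E_{\alpha}(-\lambda t^{\alpha}),
\qquad
E_{\alpha}(z) = \sum_{k=0}^{\infty} \frac{z^{k}}{\Gamma(\alpha k + 1)}
```

For α = 1 the Mittag-Leffler function reduces to the exponential and the familiar solution y(t) = y(0)e^{-λt} is recovered, which is why the Caputo operator is the convenient choice when initial conditions are stated classically.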
A Study and Brief Investigation on Invisibility Of Aircrafts (Vimanas)IJERA Editor
Making aircraft invisible to the human eye would be a great advantage in defence and military use. The Vedas show evidence that the ancient ancestors had advanced technology to make a vimana invisible; they had great knowledge of the sun's rays and of energy extraction, which could create invisibility for the vimana. The paper describes the various writings and investigations on these topics, comparing them with modern findings on the use of solar rays, leading to a better understanding of the ancient advanced vimanas and providing a path toward findings on invisibility.
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION IJCI JOURNAL
Data mining is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data; in short, it is the extraction of information from a huge database. Data mining plays a vital role in applications such as business organizations, educational institutions, government sectors, the health care industry, science and engineering. In the health care industry, data mining is predominantly used for disease prediction, with numerous techniques available: classification, clustering, association rules, summarization, regression, and so on. The main objective of this research work is to predict kidney diseases using classification algorithms, namely Naïve Bayes and Support Vector Machine, and to find the best classification algorithm based on classification accuracy and execution time. The experimental results show that the SVM performs better than the Naïve Bayes classifier.
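The evaluation methodology (accuracy plus execution time) can be sketched with a from-scratch Gaussian Naive Bayes on synthetic data. The data and class means are invented, and the SVM side of the comparison is not re-implemented; only the measurement harness is illustrated:

```python
import math, random, time

# Sketch of an accuracy + execution-time evaluation using a
# from-scratch Gaussian Naive Bayes on synthetic two-class data.
random.seed(0)

def make_data(n, mean):
    return [[random.gauss(m, 1.0) for m in mean] for _ in range(n)]

train = [(x, 0) for x in make_data(100, [0, 0])] + \
        [(x, 1) for x in make_data(100, [3, 3])]
test = [(x, 0) for x in make_data(50, [0, 0])] + \
       [(x, 1) for x in make_data(50, [3, 3])]

def fit(data):
    model = {}
    for label in (0, 1):
        xs = [x for x, y in data if y == label]
        cols = list(zip(*xs))
        mu = [sum(c) / len(c) for c in cols]
        var = [sum((v - m) ** 2 for v in c) / len(c)
               for c, m in zip(cols, mu)]
        model[label] = (mu, var, len(xs) / len(data))
    return model

def predict(model, x):
    def log_post(label):   # log prior + sum of Gaussian log-likelihoods
        mu, var, prior = model[label]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, mu, var))
    return max((0, 1), key=log_post)

start = time.perf_counter()
model = fit(train)
acc = sum(predict(model, x) == y for x, y in test) / len(test)
elapsed = time.perf_counter() - start
print(f"accuracy={acc:.2f} time={elapsed * 1000:.1f}ms")
```

Running the same harness over a second classifier and comparing the two (accuracy, elapsed) pairs is exactly the comparison the paper performs in principle.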
Enhanced Detection System for Trust Aware P2P Communication NetworksEditor IJCATR
A botnet is a number of computers set up to forward transmissions to other computers without the knowledge of their users, and detecting botnets is of great importance. However, peer-to-peer (P2P) structured botnets are very difficult to detect because they have no centralized server. In this paper, we deliver a P2P infrastructure that improves the trust of peers and their data. To identify botnets, we provide a technique called data provenance integrity, which ensures the correct origin or source of information and prevents opponents from using host resources. A reputation-based trust model is used to select trusted peers: each peer has a reputation value calculated from its past activity. A hash table is used for efficient file searching, and the data stored in it is organized by reputation value.
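The reputation-plus-hash-table combination can be sketched as below. The update rule (an exponential moving average), its parameters, and the peer names are all invented for illustration; the paper's exact model is not reproduced:

```python
# Sketch of a reputation-based trust model: each peer's score is an
# exponential moving average of past transaction outcomes, and the
# file index (a hash table) returns providers sorted by reputation.

class TrustNetwork:
    ALPHA = 0.3                      # weight of the newest observation

    def __init__(self):
        self.reputation = {}         # peer -> score in [0, 1]
        self.index = {}              # file name -> set of provider peers

    def record(self, peer, success):
        """success is 1 for a good transaction, 0 for a bad one."""
        old = self.reputation.get(peer, 0.5)   # neutral prior
        self.reputation[peer] = (1 - self.ALPHA) * old + self.ALPHA * success

    def publish(self, peer, filename):
        self.index.setdefault(filename, set()).add(peer)

    def lookup(self, filename):
        """Providers of a file, most trusted first."""
        peers = self.index.get(filename, set())
        return sorted(peers, key=lambda p: -self.reputation.get(p, 0.5))

net = TrustNetwork()
for outcome in (1, 1, 1):
    net.record("honest_peer", outcome)
for outcome in (0, 0, 0):            # repeated bad transfers: likely a bot
    net.record("suspect_peer", outcome)
net.publish("honest_peer", "update.bin")
net.publish("suspect_peer", "update.bin")
print(net.lookup("update.bin"))  # ['honest_peer', 'suspect_peer']
```

Peers whose reputation decays toward zero are effectively quarantined, since lookups rank them last.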
Comparative Study of Diabetic Patient Data’s Using Classification Algorithm i...Editor IJCATR
Data mining refers to extracting knowledge from large amount of data. Real life data mining approaches are
interesting because they often present a different set of problems for diabetic patient’s data. The research area to solve
various problems and classification is one of main problem in the field. The research describes algorithmic discussion of J48,
J48 Graft, Random tree, REP, LAD. Here used to compare the performance of computing time, correctly classified
instances, kappa statistics, MAE, RMSE, RAE, RRSE and to find the error rate measurement for different classifiers in
weka .In this paper the data classification is diabetic patients data set is developed by collecting data from hospital repository
consists of 1865 instances with different attributes. The instances in the dataset are two categories of blood tests, urine tests.
Weka tool is used to classify the data is evaluated using 10 fold cross validation and the results are compared. When the
performance of algorithms, we found J48 is better algorithm in most of the cases.
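The 10-fold cross-validation protocol Weka applies can be sketched as follows. The "classifier" is a majority-class stand-in and the dataset is invented (not the 1865-instance hospital set); only the fold-splitting and averaging logic is illustrated:

```python
import random

# Sketch of k-fold cross validation: split the instances into k folds,
# train on k-1 folds, test on the remaining one, and average accuracy.
random.seed(1)
data = [(i, "diabetic" if random.random() < 0.6 else "healthy")
        for i in range(200)]

def cross_validate(data, k=10):
    shuffled = data[:]
    random.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]   # k equal interleaved folds
    accs = []
    for i in range(k):
        test = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        # Majority-class baseline "learned" from the training folds;
        # a real run would fit J48, Random Tree, etc. here instead.
        labels = [y for _, y in train]
        majority = max(set(labels), key=labels.count)
        accs.append(sum(y == majority for _, y in test) / len(test))
    return sum(accs) / k

print(f"mean accuracy: {cross_validate(data):.2f}")
```

Running the loop once per classifier and comparing the mean accuracies (plus timing and the error measures listed above) reproduces the structure of the Weka comparison.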
ICU PATIENT DETERIORATION PREDICTION: A DATA-MINING APPROACHcscpconf
A huge amount of medical data is generated every day, which presents a challenge in analysing
these data. The obvious solution to this challenge is to reduce the amount of data without
information loss. Dimension reduction is considered the most popular approach for reducing
data size and also to reduce noise and redundancies in data. In this paper, we investigate the
effect of feature selection in improving the prediction of patient deterioration in ICUs. We
consider lab tests as features. Thus, choosing a subset of features would mean choosing the
most important lab tests to perform. If the number of tests can be reduced by identifying the
most important tests, then we could also identify the redundant tests. By omitting the redundant
tests, observation time could be reduced and early treatment could be provided to avoid the risk.
Additionally, unnecessary monetary cost would be avoided. Our approach uses state-of-the-art
feature selection for predicting ICU patient deterioration using the medical lab results. We
apply our technique on the publicly available MIMIC-II database and show the effectiveness of
the feature selection. We also provide a detailed analysis of the best features identified by our
approach.
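A standard filter criterion for ranking lab tests as described above is information gain. A toy sketch (the paper's exact selector and the MIMIC-II data are not reproduced; the records are invented):

```python
import math
from collections import Counter

# Rank lab tests by information gain: how much knowing one test result
# reduces the entropy of the patient outcome.

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(records, labels, test):
    """H(outcome) - H(outcome | result of one lab test)."""
    base = entropy(labels)
    groups = {}
    for rec, y in zip(records, labels):
        groups.setdefault(rec[test], []).append(y)
    cond = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return base - cond

# Toy ICU records: lactate tracks deterioration, sodium is pure noise.
records = [{"lactate": "high", "sodium": "normal"},
           {"lactate": "high", "sodium": "low"},
           {"lactate": "low", "sodium": "normal"},
           {"lactate": "low", "sodium": "low"}]
labels = ["deteriorated", "deteriorated", "stable", "stable"]

ranked = sorted(["lactate", "sodium"],
                key=lambda t: -info_gain(records, labels, t))
print(ranked)  # lactate first: the informative test; sodium is redundant
```

Tests at the bottom of such a ranking are the candidates for omission, which is exactly how the approach shortens observation time and cost.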
A comparative analysis of classification techniques on medical data setseSAT Publishing House
An efficient feature selection algorithm for health care data analysisjournalBEEI
Diabetes is a silent killer that slowly harms a person if it goes undetected. Existing systems that check whether a person has diabetes using the F-score method and K-means clustering are not 100% accurate, and anything less than 100% is problematic in the medical field, as it could cost the lives of many people. Our proposed system aims to combine some of the best features of the existing algorithms for predicting diabetes into a novel, more accurate algorithm. With the surge in technological advancements, we can use data mining to predict when a person will be diagnosed with diabetes; specifically, we analyze the best features of the chi-square algorithm and the advanced clustering algorithm (ACA). This research work uses the Pima Indian Diabetes dataset provided by the National Institute of Diabetes and Digestive and Kidney Diseases. Using classification methods, we consider factors such as age, BMI and blood pressure, weigh the overall importance given to these attributes, single out the most important ones, and use them for the prediction of diabetes.
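The chi-square criterion the authors build on scores an attribute by comparing observed class counts in each attribute bucket against the counts expected under independence. A sketch with an invented 2x2 contingency table:

```python
# Chi-square feature score: large values mean the attribute's buckets
# are strongly associated with the class, small values mean noise.

def chi_square(table):
    """table[i][j] = count of (attribute bucket i, class j)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    chi = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            expected = r * c / total         # counts under independence
            chi += (table[i][j] - expected) ** 2 / expected
    return chi

# High-BMI vs normal-BMI buckets against diabetic / non-diabetic counts
# (numbers invented for illustration).
bmi = [[40, 10],    # high BMI: mostly diabetic
       [15, 35]]    # normal BMI: mostly not
uninformative = [[25, 25],
                 [25, 25]]  # attribute independent of the class

print(chi_square(bmi) > chi_square(uninformative))  # True
```

Ranking attributes such as age, BMI and blood pressure by this score, then keeping the top few, is the feature-selection step the abstract describes.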
Disease prediction in big data healthcare using extended convolutional neural...IJAAS Team
Diabetes mellitus is one of the growing fatal diseases all over the world, leading to complications that include heart disease, stroke, nerve disease and kidney damage. Medical professionals therefore want a reliable prediction system to diagnose diabetes. To predict diabetes at an earlier stage, different machine learning techniques are useful for examining data from different sources and synopsizing valuable knowledge, so mining diabetes data efficiently is a crucial concern. In this project, a medical dataset is used to predict diabetes. R-Studio and PySpark were employed as statistical computing tools for diagnosing diabetes, and the PIMA Indian database acquired from the UCI repository was used for analysis. The dataset was studied and analyzed to build an effective model that predicts and diagnoses diabetes disease earlier.
Quality defects in TMT Bars, Possible causes and Potential Solutions.PrashantGoswami42
Maintaining high-quality standards in the production of TMT bars is crucial for ensuring structural integrity in construction. Addressing common defects through careful monitoring, standardized processes, and advanced technology can significantly improve the quality of TMT bars. Continuous training and adherence to quality control measures will also play a pivotal role in minimizing these defects.
Student information management system project report ii.pdfKamal Acharya
Our project explains student management. It mainly covers the various actions related to student details, making it easy to add, edit and delete student details, and provides a less time-consuming process for viewing, adding, editing and deleting students' marks.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult, and can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it is tough to interpret those ingredient lists unless you have a background in chemistry. Instead of buying and hoping for the best, we can use data science to help predict which products may be good fits for us. The project includes various function programs for the tasks mentioned above, and data file handling has been used effectively in the program.
The automated cosmetic shop management system deals with the automation of the general workflow and administration processes of the shop. Its main processes focus on customer requests, where the system can search for the most appropriate products and deliver them to customers. It helps employees quickly identify the cosmetic products that have reached their minimum quantity, keeps track of the expiry date of each product, and helps employees find the rack number in which a product is placed, in a faster and more efficient way.
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two type of water scarcity. One is physical. The other is economic water scarcity.
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSEDuvanRamosGarzon1
AIRCRAFT GENERAL
The Single Aisle is the most advanced family aircraft in service today, with fly-by-wire flight controls.
The A318, A319, A320 and A321 are twin-engine subsonic medium range aircraft.
The family offers a choice of engines
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedures of Shahjalal Fertilizer Company Limited (SFCL), a government-owned company of Bangladesh Chemical Industries Corporation under the Ministry of Industries.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Welcome to WIPAC Monthly, the magazine brought to you by the LinkedIn group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news, and to celebrate the 13 years since the group was created, we have articles including:
A case study of the use of Advanced Process Control at the wastewater treatment works at Lleida in Spain.
A look back at an article on smart wastewater networks, to see how the industry has measured up in the interim on the adoption of digital transformation in the water industry.
Event Management System Vb Net Project Report.pdfKamal Acharya
In the present era, the scope of information technology is growing very fast; no area is untouched by this industry. It now includes business and industry, household business, communication, education, entertainment, science, medicine, engineering, distance learning, weather forecasting, career searching and so on.
My project, named "Event Management System", is software that stores and maintains all events coordinated in a college. It is also helpful for printing related reports. The project records the events coordinated by faculty members with their name, event subject, date and details in an efficient and effective way.
The system lets a user record all events coordinated by a particular faculty member. Compared with the existing system, the proposed system adds further features, such as security.
Propose a Enhanced Framework for Prediction of Heart Disease
K. Sudhakar, Int. Journal of Engineering Research and Applications, www.ijera.com
ISSN : 2248-9622, Vol. 5, Issue 4, ( Part -2) April 2015, pp.01-06
K. Sudhakar*, Dr. M. Manimekalai**
*Research Scholar, Department of Computer Applications, Shrimati Indira Gandhi College, Trichy, Tamilnadu,
India
** Director and Head, Department of Computer Applications, Shrimati Indira Gandhi College, Trichy,
Tamilnadu, India
ABSTRACT
Heart disease diagnosis is a complex task that requires considerable experience. Traditionally, heart disease is examined through a number of medical tests prescribed by the doctor, such as heart MRI, ECG and stress tests. Today, the health care industry holds a huge amount of health care data containing hidden information, by means of which effective decisions can be made; for accurate results, advanced computer-based data mining techniques are used. Artificial neural networks (ANNs) are mathematical techniques used in the empirical sciences for inference and categorisation, and also for modelling real neural networks. Mental phenomena such as acting, wanting, knowing, remembering, perceiving, thinking and inferring can be understood using ANN theory, and since ANNs are powerful instruments for inference and classification, problems of probability and induction also arise. In this paper, classification techniques, namely the Naive Bayes classification algorithm and artificial neural networks, are used to classify the attributes in a given data set, while attribute filtering techniques, namely PCA (Principal Component Analysis) filtering and Information Gain attribute subset evaluation, perform feature selection on the data set to predict heart disease symptoms. A new framework based on these techniques is proposed: the input dataset is fed into a feature selection block, which chooses whichever technique yields the fewest attributes; classification is then done using the two algorithms, and the attributes selected by both classification tasks are taken for the prediction of heart disease. The framework reduces the time needed to predict the symptoms of heart disease and lets the user know the important attributes.
Keywords – ANN, PCA, SVM, CFS.
I. INTRODUCTION
Over the past ten years, heart disease has become the major cause of death all around the globe (World Health Organization 2007) [1]. To help health care professionals, several data mining techniques have been used by researchers in detecting heart disease. The European Public Health Alliance stated that heart attacks, strokes and other circulatory diseases account for 41% of all deaths (European Public Health Alliance 2010) [2]. One fifth of Asian lives are lost to non-communicable diseases such as chronic respiratory diseases, cardiovascular diseases and cancer, as described by ESCAP (Economic and Social Commission for Asia and the Pacific 2010) [2]. The Australian Bureau of Statistics (ABS) described circulatory system diseases and heart diseases as the primary cause of death in Australia, causing 33.7% of all deaths (Australian Bureau of Statistics 2010) [3]. Every year, heart disease affects a great number of patients around the globe. Given the availability of large amounts of patient data from which to extract useful knowledge, researchers have used data mining techniques to assist health care professionals in the diagnosis of heart disease [4]. Nowadays, data mining is the exploration of large datasets to extract hidden and previously unknown patterns, relationships and knowledge that are difficult to detect with conventional statistical methods. In the emerging field of healthcare, data mining plays a major role in extracting details for a deeper understanding of medical data when providing a prognosis [5]. With the development of modern technology, data mining applications in healthcare include the analysis of health care centres for improved health policy-making, prevention of hospital errors, early detection and prevention of diseases and preventable hospital deaths, better value for money and cost savings, and detection of fraudulent insurance claims.
Attribute selection has been an active and productive research area in the pattern recognition, machine learning, statistics and data mining communities. The main intention of attribute selection is to choose a subset of input variables by eliminating features that are irrelevant or carry no prognostic information. Feature selection [6] has proven, in both theory and practice, to be valuable in enhancing learning efficiency, increasing predictive accuracy and reducing the complexity of learned results. In supervised learning, the chief objective of feature selection is to find a feature subset that produces higher classification accuracy. As the dimensionality of a domain expands, the number of features N increases; finding an optimal feature subset among them is intractable, and many problems related to feature selection have been shown to be NP-hard. It is therefore useful to describe the traditional feature selection process, which consists of four basic steps: subset generation, evaluation of the subset, stopping criterion, and validation of the subset. Subset generation is a search process that generates candidate feature subsets for assessment based on a certain search strategy. Each candidate subset is evaluated according to a certain assessment criterion and compared with the best prior one; if the new subset turns out to be better, it replaces the best one. The process is repeated until the stopping condition is fulfilled. Data used in machine learning is often characterised by a large number of features, which can exceed the number of data points themselves [7]. This kind of problem, known as "the curse of dimensionality", creates a challenge for a variety of ML applications for decision support: it can amplify the risk of taking correlated or redundant attributes into account, which can lead to lower classification accuracy. As a result, the process of eliminating irrelevant features is a crucial phase in designing decision support systems with high accuracy.
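The four-step loop described above (subset generation, evaluation, comparison with the best subset so far, and a stopping criterion) can be sketched as a generic greedy forward search. This is a minimal illustration, not the paper's method; the `evaluate` scoring function and the toy feature names are placeholder assumptions:

```python
def forward_select(features, evaluate, max_size=None):
    """Greedy subset generation/evaluation loop: grow the best subset one
    feature at a time, and stop when no candidate improves the score."""
    best_subset, best_score = frozenset(), float("-inf")
    while True:
        if max_size is not None and len(best_subset) >= max_size:
            break                                          # stopping criterion
        # subset generation: all one-feature extensions of the current best
        candidates = [best_subset | {f} for f in features if f not in best_subset]
        if not candidates:
            break
        scored = max(candidates, key=evaluate)             # subset evaluation
        if evaluate(scored) > best_score:                  # compare with prior best
            best_subset, best_score = scored, evaluate(scored)
        else:
            break                                          # no improvement: stop
    return set(best_subset)

# Placeholder evaluator: rewards features "a" and "c", penalises subset size
score = lambda s: len(s & {"a", "c"}) - 0.1 * len(s)
chosen = forward_select(["a", "b", "c", "d"], score)
```

A wrapper model would replace the placeholder `evaluate` with a learning algorithm's cross-validated accuracy on the candidate subset.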
In this technical world, data mining is the only consistent means available to unravel the intricacy of accumulated data. Data mining tasks can be broadly categorised into two types: descriptive and predictive. Descriptive mining tasks characterise the general properties of the data in the database, while predictive mining tasks perform inference on the present data in order to make predictions. Data available for mining is raw data: it is collected from different sources and may therefore differ in format; moreover, it may contain noisy data, irrelevant attributes, missing values, etc. Discretization: when a data mining algorithm cannot cope with continuous attributes, discretization [8] needs to be employed. This step transforms a continuous attribute into a categorical one, taking only a small number of discrete values; discretization often improves the comprehensibility of the discovered knowledge. Attribute selection: not all attributes are relevant, so attribute selection [9], which picks out a subset of the original attributes relevant for mining, is mandatory.
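As a small illustration of the discretization step described above, the following sketch maps a continuous attribute onto a few discrete values by equal-width binning (the cholesterol readings and the choice of three bins are hypothetical assumptions, not taken from the paper):

```python
def discretize_equal_width(values, n_bins=3):
    """Map a continuous attribute onto a small number of discrete bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        bin_index = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum
        labels.append(bin_index)
    return labels

# Hypothetical cholesterol readings discretized into low/medium/high bins
cholesterol = [180.0, 210.0, 250.0, 300.0, 199.0, 275.0]
bins = discretize_equal_width(cholesterol, n_bins=3)
```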
II. CLASSIFICATION TECHNIQUES
Classification, the most commonly used data mining technique, employs a set of pre-classified examples to develop a model that can categorise the population of records at large. Data classification involves two steps: learning and classification. In the learning step, the training data are analysed by the classification algorithm [4][10]; in the classification step, test data are used to estimate the accuracy of the classification rules. If the accuracy is acceptable, the rules can be applied to new data tuples. The classifier-training algorithm uses the pre-classified examples to determine the set of parameters required for proper discrimination; only after the algorithm encodes these parameters do we obtain the model, called a classifier. The classification techniques used in this paper are the artificial neural network and the Naive Bayesian classification algorithm.
1. Artificial Neural Network
Neural networks are non-linear statistical data modelling tools. They can be used to discover patterns in data or to model complex relationships between inputs and outputs. Using a neural network as the tool, data warehousing firms apply the process of collecting information from datasets, also known as data mining [11]. Unlike an ordinary database, a data warehouse supports the cross-fertilisation of data from different sources, helping users make more informed decisions.
Figure 1: Example ANN
Among the most popular neural network algorithms are Hopfield networks, multilayer perceptrons, counter-propagation networks, radial basis function networks and self-organising maps. Of these, the feed-forward neural network was the first and simplest type of artificial neural network; it consists of three layers, an input layer, a hidden layer and an output layer, and contains no cycles or loops. A neural network [12] has to be configured to produce the required set of outputs. There are basically three learning settings for a neural network: 1) supervised learning, 2) unsupervised learning, and 3) reinforcement learning. The perceptron is the basic unit of an artificial neural network used for classification where the patterns are linearly separable; the basic neuron model used in the perceptron is the McCulloch-Pitts model. The perceptron learning algorithm is as follows:
Step 1: Let D = {(xi, yi) | i = 1, 2, ..., n} be the set of training examples.
Step 2: Initialize the weight vector with random values, w(0).
Step 3: Repeat.
Step 4: For each training sample (xi, yi) ∈ D:
Step 5: Compute the predicted output ŷi(k).
Step 6: For each weight wj:
Step 7: Update the weight: wj(k+1) = wj(k) + (yi − ŷi(k)) xij.
Step 8: End for.
Step 9: End for.
Step 10: Until the stopping criterion is met.
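The steps above can be sketched in Python. This is a minimal illustration of the Step 7 update rule, not the paper's implementation; the step activation and the toy AND dataset are assumptions:

```python
import numpy as np

def perceptron_train(X, y, epochs=100):
    """Train a perceptron with the update w_j <- w_j + (y_i - y_hat_i) * x_ij."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column
    rng = np.random.default_rng(0)
    w = rng.normal(size=X.shape[1])               # Step 2: random weights
    for _ in range(epochs):                       # Steps 3/10: repeat until done
        errors = 0
        for xi, yi in zip(X, y):                  # Step 4: each training sample
            y_hat = 1 if xi @ w >= 0 else 0       # Step 5: predicted output
            w += (yi - y_hat) * xi                # Step 7: weight update
            errors += int(y_hat != yi)
        if errors == 0:                           # stopping criterion reached
            break
    return w

# Toy linearly separable data (hypothetical, for illustration): logical AND
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 0, 1])
w = perceptron_train(X, y)
preds = [1 if np.r_[1.0, xi] @ w >= 0 else 0 for xi in X]
```

Because the AND patterns are linearly separable, the perceptron convergence theorem guarantees this loop terminates with all samples classified correctly.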
2. Naïve Bayesian Classification Technique
The Bayesian classification is a supervised learning method as well as a statistical method for classification [11]. It assumes an underlying probabilistic model and allows us to capture uncertainty about the model in a principled way by determining the probabilities of the outcomes. It can also solve diagnostic and predictive problems. The Bayes theorem is as follows: given training data Y, the posterior probability of a hypothesis I, R(I|Y), follows Bayes' theorem
R(I|Y) = R(Y|I) R(I) / R(Y).
Algorithm:
The Naive Bayes algorithm is based on Bayes' theorem as given by the above equation.
Step 1: Each data sample is represented by an n-dimensional feature vector, S = (s1, s2, ..., sn), depicting n measurements made on the sample from the n attributes S1, S2, ..., Sn respectively.
Step 2: Suppose that there are k classes, T1, T2, ..., Tk. Given an unknown data sample Y (i.e., one having no class label), the classifier will predict that Y belongs to the class having the highest posterior probability; that is, it assigns Y to class Tj if and only if
R(Tj|Y) > R(Tl|Y) for all 1 ≤ l ≤ k and l ≠ j.
The class Tj for which R(Tj|Y) is maximised is called the maximum posterior hypothesis and is found by Bayes' theorem.
Step 3: Since R(Y) is constant for all classes, only R(Y|Tj)R(Tj) needs to be maximised. When the class prior probabilities are not known, the classes are assumed to be equally likely, i.e. R(T1) = R(T2) = ... = R(Tk), and consequently we would maximise R(Y|Tj) alone; otherwise, we maximise R(Y|Tj)R(Tj). Note that the class prior probabilities may be estimated by R(Tj) = aj/a, where aj is the number of training samples of class Tj and a is the total number of training samples. In this way, the naive classifier assigns an unknown sample Y to the class Tj with the highest posterior probability.
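The classification rule above (maximise R(Y|Tj)R(Tj), with priors estimated as R(Tj) = aj/a) can be sketched as follows. The per-attribute frequency likelihoods with add-one smoothing and the toy samples are assumptions for illustration, not part of the paper:

```python
from collections import Counter

def naive_bayes_predict(train, sample):
    """Assign `sample` to the class T_j maximising R(Y|Tj) * R(Tj).

    `train` is a list of (feature_tuple, label) pairs; likelihoods are
    per-attribute frequency estimates with add-one smoothing.
    """
    labels = [label for _, label in train]
    class_counts = Counter(labels)                      # a_j for each class
    total = len(train)                                  # a
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total                           # prior R(Tj) = a_j / a
        rows = [x for x, y in train if y == label]
        for i, value in enumerate(sample):
            matches = sum(1 for x in rows if x[i] == value)
            score *= (matches + 1) / (count + 2)        # smoothed R(s_i | Tj)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (chest_pain, high_cholesterol) -> class label
train = [(("typical", "yes"), "disease"), (("typical", "yes"), "disease"),
         (("atypical", "no"), "healthy"), (("atypical", "yes"), "healthy")]
label = naive_bayes_predict(train, ("typical", "yes"))
```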
III. FEATURE SELECTION TECHNIQUES
In general, feature subset selection is a pre-processing step used in machine learning [13]. It is valuable in reducing dimensionality and eliminating irrelevant data, and it therefore increases the learning accuracy. It refers to the problem of identifying those features that are useful in predicting the class. Features can be discrete, continuous or nominal, and on the whole they fall into three types: 1) relevant, 2) irrelevant, 3) redundant. Feature selection methods fall into filter, wrapper and embedded models. The filter model relies on analysing the general characteristics of the data to evaluate features and does not involve any learning algorithm, whereas the wrapper model uses a predetermined learning algorithm and uses that algorithm's performance on the provided features in the evaluation step to identify the relevant features. Embedded models integrate feature selection as a part of the model training process.
The data collected from medical sources is highly voluminous in nature, and several significant factors affect the success of data mining on medical data. If the data is irrelevant or redundant, then knowledge discovery during the training phase is more difficult. Figure 2 shows the flow of feature subset selection (FSS).
Figure 2: Feature Subset Selection
1. Principal Component Analysis (PCA) Feature Selection Technique
Principal component analysis (PCA) [14] is a standard technique used in many applications such as face recognition, pattern recognition, image compression and data mining. PCA is used to shrink the dimensionality of data consisting of a large number of attributes. PCA can be generalised as multiple factor analysis to handle heterogeneous sets of variables, and as correspondence analysis to handle qualitative variables. Mathematically, PCA depends on the singular value decomposition (SVD) of rectangular matrices and the eigendecomposition of positive semi-definite matrices. The main procedure of principal component analysis is as follows:
Step 1: Obtain the input data matrix.
Step 2: Subtract the mean from the data set in all dimensions.
Step 3: Calculate the covariance matrix of this mean-subtracted data set.
Step 4: Calculate the eigenvalues and eigenvectors of the covariance matrix.
Step 5: Form a feature vector from the chosen eigenvectors.
Step 6: Derive the new data set.
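Steps 1-6 can be sketched with NumPy as follows; the toy two-attribute matrix and the choice of retaining one component are assumptions for illustration:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (Steps 1-6 above)."""
    X = np.asarray(X, dtype=float)                 # Step 1: input data matrix
    centered = X - X.mean(axis=0)                  # Step 2: subtract the mean
    cov = np.cov(centered, rowvar=False)           # Step 3: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)         # Step 4: eigen decomposition
    order = np.argsort(eigvals)[::-1]              # sort by decreasing variance
    feature_vector = eigvecs[:, order[:n_components]]  # Step 5: feature vector
    return centered @ feature_vector               # Step 6: the new data set

# Hypothetical data: two correlated attributes reduced to one component
X = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]]
reduced = pca_reduce(X, n_components=1)
```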
2. Information Gain Attribute Subset Feature Selection Technique
This method uses the discernibility function [12], given as follows: for an information system (I, H), the discernibility function DF is a Boolean function of m Boolean variables e1, e2, ..., em, corresponding to the attributes e1, e2, ..., em respectively, and defined as DF(e1, e2, ..., em) = S1 ∧ S2 ∧ ... ∧ Sm, where ej ∈ S. The proposed algorithm for information gain attribute subset evaluation is defined as below:
Step 1: For the selected dataset, compute the discernibility matrix using
P[J,I] = {e ∈ E : X[J] ≠ X[I] and Y[J] ≠ Y[I]}, J, I = 1, 2, ..., m (Eq. 1)
where X are the conditional attributes and Y is the decision attribute. The discernibility matrix P is symmetric, with P[a,b] = P[b,a] and P[a,a] = 0; it is therefore sufficient to consider only the lower or upper triangle of the matrix.
Step 2: For the discernibility matrix, compute the discernibility function of P[a,b] using
DF(a) = ∧ { ∧ P[a,b] : a, b ∈ I; P[a,b] ≠ 0 } (Eq. 2)
Step 3: Apply the expansion law to the conjunctive sets containing at least two elements.
Step 4: Repeat steps 1 to 3 until the expansion law cannot be applied to any component.
Step 5: Substitute all strongly equivalent classes with their corresponding attributes.
Step 6: For the attributes contained in the simplified discernibility function, calculate the information gain using the following formulas.
Gain(Gi) = F(Ri) − F(Gi) (Eq. 3)
where
F(G) = − Σ_{j=1}^{m} Rj log2 Rj (Eq. 4)
= − (r1/r) log2(r1/r) − (r2/r) log2(r2/r) − ... − (rm/r) log2(rm/r) (Eq. 5)
and Rj is the ratio of conditional attribute R in the dataset. When Gi has |Gi| kinds of attribute values and the conditional attribute Rj partitions the set R using attribute Gi, the information value F(Gi) is defined as
F(Gi) = Σ_{i=1}^{|Gi|} Ki · F(Bi) (Eq. 6)
Step 7: Remove the attribute with the least gain value from the discernibility function, and add the attribute with the highest gain value to the reduction set. Repeat from step 6 until the discernibility function reaches the null set.
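The entropy and gain formulas in Eqs. 3-6 can be sketched for a single attribute as follows. The toy chest-pain column is a hypothetical assumption, and this computes the plain entropy-based information gain rather than the full discernibility-function procedure:

```python
import math
from collections import Counter

def entropy(labels):
    """F(R) = -sum_j Rj * log2(Rj), as in Eqs. 4-5."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(values, labels):
    """Gain(Gi) = F(R) - F(Gi), Eq. 3: the entropy drop after partitioning
    the labels by the attribute values (weighted sum as in Eq. 6)."""
    total = len(labels)
    by_value = {}
    for v, y in zip(values, labels):
        by_value.setdefault(v, []).append(y)
    conditional = sum((len(part) / total) * entropy(part)   # Ki * F(Bi)
                      for part in by_value.values())
    return entropy(labels) - conditional

# Hypothetical toy column: chest pain type vs. disease label
chest_pain = ["typical", "typical", "atypical", "atypical"]
disease    = ["yes", "yes", "no", "no"]
gain = information_gain(chest_pain, disease)
```

Here the attribute splits the labels into two pure partitions, so the conditional entropy is zero and the gain equals the full label entropy.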
IV. PROPOSED FRAMEWORK
The following figure shows the framework proposed in this paper for predicting heart disease in a patient at an early stage from a few important tests (attributes), using an artificial neural network together with feature selection techniques.
Figure 3: Proposed Framework
In this framework, the input dataset is fed into the feature selection block, where feature selection is performed on the given dataset; the block takes whichever method yields the smaller number of selected attributes from the given set of attributes. The selected attributes are then passed to the classification techniques, and the ANN classifies the selected attributes. The resulting dataset is generated for predicting heart disease easily and at an early stage. The proposed framework also reduces the clustering task, since the grouping of identical attributes is already carried out by the framework while predicting heart disease from the given heart disease dataset.
V. DATASET AND TOOL
The data used in this study come from the Hungarian Institute of Cardiology. The dataset contains 303 instances and 14 attributes of heart disease patients [15]. The tool used is Orange (http://orange.biolab.si), a general-purpose machine learning and data mining tool. It features a multilayer architecture suitable for different kinds of users, from inexperienced data mining beginners to programmers who prefer to access the tool through its scripting interface. The following are the attributes of the given heart disease dataset.
S. No Attribute
1 Age
2 Gender
3 Chest Pain
4 Rest SBP
5 Cholesterol
6 Fasting Blood
7 Rest ECG
8 Maximum HR
9 Exer Ind
10 ST by exercise
11 Slope peak exc ST
12 Major vessels colored
13 Thal
14 Diameter Narrowing
Table 1: The attributes of the heart disease dataset
VI. EXPERIMENTAL RESULT AND ANALYSIS
The dataset is fed into the Orange tool to execute the above framework without using any clustering technique. Attribute filtering is performed on the given heart disease dataset, which contains 14 attributes and 303 instances.
Feature Selection Method    Total Number of Attributes    Number of Filtered Attributes
Information Gain Method     14                            8
PCA                         14                            6
Table 2: Filtered attributes using feature selection method
Attribute Filtering Method    Filtered Attributes        Our Framework Result
Information Gain              8 (5,8,10,13,3,12,4,9)     4 (3,9,12,13)
PCA                           6 (3,7,9,11,12,13)
Table 3: Result of our framework using filtered attributes
S. No    Classifier             Correctly Classified Instances    Incorrectly Classified Instances    Accuracy of the Classification
1        SVM                    232                               71                                  76.45%
2        ANN                    254                               49                                  83.70%
3        Classification Tree    228                               75                                  75.25%
4        Naïve Bayes            251                               52                                  82.75%
Table 4: Classifiers accuracy with full dataset
Attribute Filtering Method    Number of Attributes Filtered
CFS                           3 (3,5,4)
Information Gain              10 (3,5,21,4,15,19,11,30,29,26)
Gain Ratio                    9 (3,5,2,21,4,15,19,11,6)
PCA                           11 (3,5,21,4,15,19,2,11,30,23,29)
Table 5: Number of filtered attributes by each attribute filtering method

From Table 2 above, the PCA attribute filtering method selects 6 attributes and Information Gain filters 8 attributes out of 14. Table 3 gives the result of our framework, which groups the attributes common to the PCA and Information Gain attribute filtering methods. From Table 4, the classification accuracy on the full dataset is higher for ANN than for SVM, the classification tree and the Naïve Bayes algorithm: the classification accuracy is 83.70% for ANN and 82.75% for Naïve Bayesian classification, higher than the other methods. And from Table 5, the Information Gain and PCA attribute filtering algorithms give more attributes than the CFS and Gain Ratio attribute filtering methods.
VII. CONCLUSION
Classification methods are the most common and efficient way to predict heart disease when the dataset contains a large number of attributes, and feature selection is used to filter the attributes by their importance for the prediction. The proposed framework uses the PCA and Information Gain attribute filtering techniques, because these two methods give more attributes than the others, and from the experimental results we can conclude that ANN and Naïve Bayes give higher classification accuracy than the other classifiers. The framework groups the attributes common to the results of the filtering methods, which automatically reduces the clustering task. Therefore, in our framework, the ANN and Naïve Bayes classification methods together with the PCA and Information Gain attribute filtering methods are best for predicting heart disease using the heart disease dataset.
REFERENCES
[1] “A Safe Future - Global Public Health Security in the 21st Century”, The World Health Report 2007.
[2] “Commission Staff Working Document”,
European Commission, Implementation of
Health Programme 2010.
[3] “Drugs in Australia 2010-Tobacco, Alcohol
and other drugs”, Australian Institute of
Health and Welfare.
[4] Vikas Chaurasia, Saurabh Pal, “Early
Prediction of Heart Diseases Using Data
Mining Techniques”, Carib.j. Sci Tech,
2013, Vol.1, pp.no:208-217.
[5] Hian Chye Koh and Gerald Tan, “Data
Mining Applications in Healthcare”, Journal
of Healthcare Information Management —
Vol. 19, No. 2, pp.no: 64-72.
[6] S. Saravana kumar, S.Rinesh, “Effective
Heart Disease Prediction using Frequent
Feature Selection Method”, International
Journal of Innovative Research in Computer
and Communication Engineering, Vol.2,
Special Issue 1, March 2014, pp.no: 2767-
2774.
[7] Jyoti Soni, Uzma Ansari, Dipesh Sharma and Sunita Soni, “Intelligent and Effective Heart Disease Prediction System using Weighted Associative Classifiers”, International Journal on Computer Science and Engineering (IJCSE), Vol. 3, No. 6, June 2011, pp. 2385-2392.
[8] Aieman Quadir Siddique, Md. Saddam
Hossain, “Predicting Heart-disease from
Medical Data by Applying Naïve Bayes and
Apriori Algorithm”, International Journal of
Scientific & Engineering Research, Volume
4, Issue 10, October-2013, pp.no: 224-231.
[9] Priyanka Palod, Jayesh Gangrade, “Efficient
Model For Chd Using Association Rule With
Fds”, International Journal of Innovative
Research in Science, Engineering and
Technology, Vol. 2, Issue 6, June 2013,
pp.no:2157-2162.
[10] Shamsher Bahadur Patel, Pramod Kumar
Yadav, Dr. D. P.Shukla, “Predict the
Diagnosis of Heart Disease Patients Using
Classification Mining Techniques”, IOSR
Journal of Agriculture and Veterinary
Science (IOSR-JAVS), Volume 4, Issue 2
(Jul. - Aug. 2013), PP 61-64.
[11] D Ratnam, P Hima Bindu, V. Mallik Sai,
S.P. Rama Devi, P. Raghavendra Rao,
“Computer-Based Clinical Decision Support
System for Prediction of Heart Diseases
Using Naïve Bayes Algorithm”,
International Journal of Computer Science
and Information Technologies, Vol. 5 (2) ,
2014, pp.no: 2384-2388.
[12] Selvakumar.P, DR. Rajagopalan. S.P, “A
Survey On Neural Network Models For
Heart Disease Prediction”, Journal of
Theoretical and Applied Information
Technology, 20th September 2014. Vol. 67
No.2, pp.no:485-497.
[13] M. Anbarasi, E. Anupriya, N.CH.S. N.
Iyengar, “Enhanced Prediction of Heart
Disease with Feature Subset Selection using
Genetic Algorithm”, International Journal
of Engineering Science and Technology,
Vol. 2(10), 2010, 5370-5376.
[14] Negar Ziasabounchi and Iman N.
Askerzade, “A Comparative Study of Heart
Disease Prediction Based on Principal
Component Analysis and Clustering
Methods”, Turkish Journal of Mathematics
and Computer Science, pp.no:1-11.
[15] Heart Disease Dataset source-
ftp://ftp.ncbi.nlm.nih.gov/geo/datasets/