Kidney failure is a serious challenge for the medical field, affecting a large share of the world's population. Because it often produces no symptoms, kidney disease is frequently identified too late, when dialysis is urgently needed. Data mining techniques can help by discovering hidden patterns and relationships in medical data. The objective of this research is to predict kidney disease using several machine learning algorithms: Support Vector Machine (SVM), Multilayer Perceptron (MLP), Decision Tree (C4.5), Bayesian Network (BN) and K-Nearest Neighbour (K-NN). The aim is to compare these algorithms and identify the most efficient one(s) on the basis of multiple criteria. The dataset used is “Chronic Kidney Disease”, and the experiments are run on the WEKA platform. The experimental results show that MLP and C4.5 have the best rates. However, when the algorithms are compared using the Receiver Operating Characteristic (ROC) curve, C4.5 appears to be the most efficient.
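The ROC-based comparison mentioned above can be illustrated with a small sketch. It computes the area under the ROC curve by pairwise comparison of scores, which is one common way to summarize a ROC curve in a single number. The study itself used WEKA; the function name `roc_auc` and the scores below are hypothetical and serve only as an illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic).

    labels: 1 for diseased, 0 for healthy.
    scores: classifier outputs, higher meaning "more likely diseased".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores from two classifiers on the same six patients.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
scores_b = [0.9, 0.8, 0.7, 0.5, 0.2, 0.1]
auc_a = roc_auc(labels, scores_a)
auc_b = roc_auc(labels, scores_b)
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}")
```

Two classifiers can have similar accuracy at one decision threshold and still differ in AUC, which is why a ROC comparison can change the ranking.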
DATA MINING CLASSIFICATION ALGORITHMS FOR KIDNEY DISEASE PREDICTION IJCI JOURNAL
Data mining is a non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data. In simple terms, it is the extraction of information from a huge database. Data mining plays a vital role in applications such as business organizations, educational institutions, government sectors, the health care industry, science and engineering. In the health care industry, data mining is predominantly used for disease prediction. Numerous data mining techniques exist for predicting diseases, namely classification, clustering, association rules, summarization and regression. The main objective of this research is to predict kidney diseases using classification algorithms such as Naïve Bayes and Support Vector Machine. The work focuses on finding the best classification algorithm based on two performance factors: classification accuracy and execution time. The experimental results show that SVM performs better than the Naïve Bayes classifier.
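The two performance factors named above, classification accuracy and execution time, can be measured with a small harness. The sketch below uses a minimal Gaussian Naïve Bayes written in plain Python; it is not the authors' implementation, it does not include an SVM, and the feature values and labels are made up for illustration.

```python
import math
import time
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def timed_accuracy(fit, predict, X_train, y_train, X_test, y_test):
    """Return (accuracy, seconds spent on training plus prediction)."""
    start = time.perf_counter()
    model = fit(X_train, y_train)
    predictions = [predict(model, row) for row in X_test]
    elapsed = time.perf_counter() - start
    correct = sum(p == t for p, t in zip(predictions, y_test))
    return correct / len(y_test), elapsed


# Hypothetical two-feature training and test data.
X_train = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.2], [3.0, 4.0], [3.2, 3.8], [2.9, 4.1]]
y_train = ["notckd", "notckd", "notckd", "ckd", "ckd", "ckd"]
X_test = [[1.1, 2.1], [3.1, 3.9]]
y_test = ["notckd", "ckd"]

accuracy, elapsed = timed_accuracy(
    fit_gaussian_nb, predict_gaussian_nb, X_train, y_train, X_test, y_test
)
print(f"accuracy={accuracy:.2f}, time={elapsed:.6f}s")
```

Any other classifier exposing a `fit`/`predict` pair of the same shape could be passed to `timed_accuracy` to obtain comparable numbers.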
1 springer format chronic changed edit iqbal qcIAESIJEECS
Today, a large number of people are affected by kidney diseases. Among them, chronic kidney disease is the most common life-threatening condition, and it can be prevented by early detection. Histological grade in chronic kidney disease provides clinically important prognostic information. Machine learning techniques are therefore applied to information collected from previously diagnosed patients in order to discover knowledge and patterns for making precise predictions. The raw data contain a large number of features, some of which carry little information or introduce error; feature selection techniques can be used to retrieve a useful subset of features and to improve computational performance. In this manuscript we use a set of filter and wrapper methods followed by bagging and boosting models with parameter tuning to classify chronic kidney disease. The capabilities of the bagging and boosting classifiers are compared, and the ensemble classifier that attains high stability with the most promising results is identified.
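As a rough illustration of the filter stage described above, the sketch below ranks features by the absolute Pearson correlation between each feature and the class label. This is only one possible filter criterion; the abstract does not say which filters were used, and the feature names and values here are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 for a constant column."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sd_x * sd_y)


def rank_features(X, y, names):
    """Return feature names sorted by |correlation with the label|, best first."""
    scores = {
        name: abs(pearson([row[j] for row in X], y))
        for j, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: one informative, one uninformative, one constant column.
names = ["serum_creatinine", "visit_number", "constant"]
X = [[1.0, 3, 5], [1.2, 1, 5], [4.5, 2, 5], [5.1, 2, 5]]
y = [0, 0, 1, 1]
ranked = rank_features(X, y, names)
print(ranked)
```

A filter of this kind scores each feature independently of any classifier, whereas a wrapper method evaluates candidate subsets by training a model on them.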
Hybrid System of Tiered Multivariate Analysis and Artificial Neural Network f...IJECEIAES
Improving the performance of systems for diagnosing coronary heart disease has been an important research topic for several decades. One improvement comes from feature selection, so that only influential attributes are used in a diagnosis system built on data mining algorithms. Unfortunately, most feature selection is done under the assumption that all necessary attributes are already available, regardless of the stage at which each attribute is obtained and the cost required. This research proposes a hybrid model for the diagnosis of coronary heart disease. Diagnosis is preceded by a feature selection process using tiered multivariate analysis; the analytical method used is logistic regression. The next stage is classification using a multi-layer perceptron neural network. In testing, the proposed system achieved accuracy of 86.3%, sensitivity of 84.80%, specificity of 88.20%, positive predictive value (PPV) of 90.03%, negative predictive value (NPV) of 81.80% and area under the curve (AUC) of 92.1%. This performance was obtained using a combination of risk factors, symptoms and exercise ECG attributes. The conclusion is that the proposed diagnosis system delivers performance in the very good category, with a small number of attributes to check and a relatively low cost.
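The metrics reported above all derive from the four cells of a binary confusion matrix. The following sketch shows the standard definitions; the counts are invented for illustration and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased patients classified as diseased/healthy.
    tn/fp: healthy patients classified as healthy/diseased.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts for 100 patients.
metrics = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```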
PERFORMANCE OF DATA MINING TECHNIQUES TO PREDICT IN HEALTHCARE CASE STUDY: CH...ijdms
With the promise of predictive analytics in big data and the use of machine learning algorithms, predicting the future is no longer a difficult task, especially in the health sector, which has evolved considerably following the development of new computer technologies that gave birth to multiple fields of research. Many efforts have been made to cope with the explosion of medical data on the one hand, and to obtain useful knowledge from it, predict diseases and anticipate cures on the other. This has prompted researchers to apply technical innovations such as big data analytics, predictive analytics and machine learning algorithms in order to extract useful knowledge and support decision making. In this paper, we present an overview of the evolution of big data in the healthcare system, and we apply three learning algorithms to a set of medical data. The objective of this research is to predict kidney disease using Support Vector Machine (SVM), Decision Tree (C4.5) and Bayesian Network (BN), and to choose the most efficient one.
PREDICTIVE ANALYTICS IN HEALTHCARE SYSTEM USING DATA MINING TECHNIQUEScscpconf
The health sector has evolved considerably following the development of new computer technologies, which pushed it to produce more medical data and gave birth to multiple fields of research. Many efforts have been made to cope with the explosion of medical data on the one hand, and to obtain useful knowledge from it on the other. This has prompted researchers to apply technical innovations such as big data analytics, predictive analytics and machine learning algorithms in order to extract useful knowledge and support decision making. With the promise of predictive analytics in big data and the use of machine learning algorithms, predicting the future is no longer a difficult task, especially in medicine, where predicting diseases and anticipating cures has become possible. In this paper we present an overview of the evolution of big data in the healthcare system, and we apply a learning algorithm to a set of medical data. The objective is to predict chronic kidney disease using the Decision Tree (C4.5) algorithm.
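C4.5 chooses the attribute to split on using the gain ratio, that is, information gain normalized by the entropy of the split itself. The sketch below computes it for a categorical attribute; it is a simplified illustration (no continuous attributes, missing values or pruning), and the attribute values and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # penalizes attributes with many values
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical attributes for four patients.
labels = ["ckd", "ckd", "notckd", "notckd"]
hypertension = ["yes", "yes", "no", "no"]   # separates the classes perfectly
ward = ["a", "b", "a", "b"]                 # unrelated to the class
ratio_informative = gain_ratio(hypertension, labels)
ratio_uninformative = gain_ratio(ward, labels)
print(ratio_informative, ratio_uninformative)
```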
Predicting heart failure using a wrapper-based feature selectionnooriasukmaningtyas
In the current health system, it is very difficult for medical practitioners and physicians to diagnose the effectiveness of heart contraction. In this research, we propose a machine learning model to predict heart contraction using an artificial neural network (ANN). We also propose a novel wrapper-based feature selection method that uses grey wolf optimization (GWO) to reduce the number of required input attributes. We compare the results achieved with our method against several conventional machine learning algorithms: support vector machine, decision tree, K-nearest neighbor, naïve Bayes, random forest and logistic regression. Computational results show not only that far fewer features are needed, but also that a higher prediction accuracy of around 87% can be achieved. This work has the potential to be applied in clinical practice and to become a supporting tool for doctors and physicians.
Predicting Chronic Kidney Disease using Data Mining Techniquesijtsrd
The kidney is a vital organ of the human body. Cases of kidney disease and kidney failure increase every year. Nowadays, chronic kidney disease (CKD) is one of the most common diseases, and many people die from it. The principal problem with CKD is that it affects the kidneys gradually: it is a steady loss of kidney function. Some people have no symptoms at all and are diagnosed by a lab test. Early detection and treatment are regarded as essential factors in the management and control of chronic kidney disease. Data mining techniques are used to extract information from clinical and laboratory data, which can help doctors recognize the severity stage of patients. Using the Probabilistic Neural Network (PNN) algorithm yields better predictions of the severity stage of chronic kidney disease. Seethal V | Kuldeep Baban Vayadande "Predicting Chronic Kidney Disease using Data Mining Techniques" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-1 , December 2020, URL: https://www.ijtsrd.com/papers/ijtsrd37974.pdf Paper URL : https://www.ijtsrd.com/computer-science/data-miining/37974/predicting-chronic-kidney-disease-using-data-mining-techniques/seethal-v
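A probabilistic neural network is, in essence, a Parzen-window classifier: each training example contributes a Gaussian kernel, the kernels are averaged per class, and the class with the largest average wins. The sketch below shows that decision rule only. The smoothing parameter `sigma`, the single made-up feature and the stage names are assumptions for illustration, not values from the paper.

```python
import math


def pnn_predict(X_train, y_train, row, sigma=1.0):
    """Classify `row` with the PNN (Parzen-window) decision rule."""
    kernel_sums, counts = {}, {}
    for x, label in zip(X_train, y_train):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, row))
        # Pattern layer: Gaussian kernel centred on the training example.
        activation = math.exp(-sq_dist / (2 * sigma ** 2))
        # Summation layer: accumulate activations per class.
        kernel_sums[label] = kernel_sums.get(label, 0.0) + activation
        counts[label] = counts.get(label, 0) + 1
    # Decision layer: the class with the highest average activation.
    return max(kernel_sums, key=lambda label: kernel_sums[label] / counts[label])


# Hypothetical one-feature training set with three severity stages.
X_train = [[1.0], [1.2], [3.0], [3.3], [6.0], [6.4]]
y_train = ["mild", "mild", "moderate", "moderate", "severe", "severe"]

stage_a = pnn_predict(X_train, y_train, [3.1], sigma=0.5)
stage_b = pnn_predict(X_train, y_train, [5.9], sigma=0.5)
print(stage_a, stage_b)
```

The choice of `sigma` controls how far each training example's influence reaches; in practice it is usually tuned on held-out data.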
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...IAEMEPublication
Machine learning algorithms play a vital role in the prediction of many diseases, such as heart disease, diabetes, cancer and lung disease. Applying machine learning algorithms to the healthcare domain relieves the burden on physicians, since it is impractical to manually scan all the data collected over a period of time in order to arrive at valuable information. Machine learning algorithms learn from a training dataset. Once an algorithm has completed its learning on the training dataset, it can automatically predict the target output label of unseen data. This work addresses the prediction of diabetes using machine learning algorithms. A conceptual architecture based on a big data architecture is proposed.
LIVER DISEASE PREDICTION BY USING DIFFERENT DECISION TREE TECHNIQUESIJDKP
Early prediction of liver disease is very important for saving lives and taking proper steps to control the disease. Decision tree algorithms have been applied successfully in various fields, especially in medical science. This research explores the early prediction of liver disease using various decision tree techniques. The liver disease dataset selected for this study consists of attributes such as total bilirubin, direct bilirubin, age, gender, total proteins, albumin and globulin ratio. The main purpose of this work is to measure and compare the performance of various decision tree techniques. The techniques used in this study are J48, LMT, Random Forest, Random Tree, REPTree, Decision Stump and Hoeffding Tree. The analysis shows that Decision Stump provides higher accuracy than the other techniques.
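A decision stump, the technique reported as most accurate above, is a decision tree with a single split. The sketch below fits one by exhaustively trying every feature and threshold and keeping the split with the fewest training errors. It is a simplified illustration rather than WEKA's DecisionStump, and the two feature columns and their values are hypothetical.

```python
def fit_stump(X, y):
    """Fit a one-split classifier on 0/1 labels.

    Returns (feature_index, threshold, label_if_above).
    """
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for label_above in (0, 1):
                errors = sum(
                    (label_above if row[j] > threshold else 1 - label_above) != target
                    for row, target in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, threshold, label_above)
    if best is None:
        raise ValueError("no feature has two distinct values")
    return best[1:]


def predict_stump(stump, row):
    """Apply the single split to one example."""
    feature, threshold, label_above = stump
    return label_above if row[feature] > threshold else 1 - label_above


# Hypothetical columns: [total_bilirubin, age]; label 1 means liver disease.
X = [[0.7, 60], [0.9, 35], [1.0, 50], [3.5, 40], [4.2, 55], [6.8, 45]]
y = [0, 0, 0, 1, 1, 1]
stump = fit_stump(X, y)
print(stump, predict_stump(stump, [5.0, 30]))
```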
Chronic kidney disease prediction is one of the most important issues in healthcare analytics, and prediction in the medical field is among the most interesting and challenging tasks. In this paper, we employ machine learning techniques to predict chronic kidney disease from clinical data. We use three machine learning algorithms, including the Decision Tree (DT) algorithm and the Naive Bayesian (NB) algorithm. The performance of these models is compared in order to select the best classifier for predicting chronic kidney disease on the given dataset.
The Analysis of Performace Model Tiered Artificial Neural Network for Assessm...IJECEIAES
Assessment models for coronary heart disease have developed rapidly in line with information technology, particularly artificial intelligence. Unfortunately, most of these models do not follow the tiered approach used by clinicians. As a result, the assessment process requires a complete examination. This study aims to analyze the performance of a tiered assessment model. The assessment system is divided into several levels that follow the stages of the examination procedure. The method at each level consists of preprocessing, building the artificial neural network (ANN) architecture, training with the Levenberg-Marquardt and one-step secant algorithms, and testing the system. The test results show the influence of each level when cases, whether the previous level's output was positive or negative, are tested again at the next level. This effect indicates that a higher level improves or strengthens the performance of the previous level.
Supervised machine learning based liver disease prediction approach with LASS...journalBEEI
The use of machine learning techniques is increasing rapidly in medical science for detecting various diseases such as liver disease (LD). Around the globe, a large number of people die because of this deadly disease. Diagnosing the disease at an early stage allows early treatment that can help cure the patient. In this research paper, a method is proposed to diagnose LD using supervised machine learning classification algorithms, namely logistic regression, decision tree, random forest, AdaBoost, KNN, linear discriminant analysis, gradient boosting and support vector machine (SVM). We also applied the least absolute shrinkage and selection operator (LASSO) feature selection technique to our dataset to suggest the attributes most highly correlated with LD. The predictions made by the algorithms under 10-fold cross-validation (CV) are evaluated in terms of accuracy, sensitivity, precision and F1-score. The decision tree algorithm has the best performance, with accuracy, precision, sensitivity and F1-score of 94.295%, 92%, 99% and 96% respectively when LASSO is included. Furthermore, a comparison with recent studies is presented to show the significance of the proposed system.
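The evaluation protocol described above, 10-fold cross-validation scored by precision, sensitivity (recall) and F1, can be sketched as follows. The fold generator uses contiguous, unshuffled folds for simplicity; real experiments typically shuffle or stratify the data first. The function names and sample values are assumptions for illustration.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def precision_recall_f1(y_true, y_pred):
    """Precision, recall (sensitivity) and F1 for 0/1 labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


folds = list(k_fold_indices(25, k=10))
scores = precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0])
print(len(folds), scores)
```

Each sample appears in exactly one test fold, so every prediction used for scoring comes from a model that never saw that sample during training.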
Machine learning approach for predicting heart and diabetes diseases using da...IAESIJAI
Environmental changes and food habits expose people to numerous diseases. Machine learning plays a vital role in predicting diseases from collected data. The health sector has plenty of electronic medical data, which helps machine learning diagnose various diseases quickly and accurately. Accuracy in medical data analysis has improved as the volume of medical data continues to grow. Doctors may have a hard time predicting symptoms accurately. This work uses Kaggle data to predict and diagnose heart disease and diabetes, which are leading causes of death. The dataset contains a target feature for the diagnosis of heart disease. For diabetes, this work derives the target variable by comparing the patient's blood sugar to normal levels. Blood pressure, body mass index (BMI) and other factors are used to diagnose these diseases. This work justifies the use of the filter method and principal component analysis for feature selection and extraction. The main aim is to highlight the implementation of three ensemble techniques (adaptive boosting, extreme gradient boosting and gradient boosting), with emphasis on the accuracy of the results.
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...IAEME Publication
The white layer thickness (WLT) formed and the surface roughness in wire electric discharge turning (WEDT) of tungsten carbide composite have been modeled through response surface methodology (RSM). A standard Taguchi design of experiments involving five input variables at three levels was employed to establish a mathematical model between input parameters and responses. Percentage of cobalt content, spindle speed, pulse on-time, wire feed and pulse off-time were varied during the experimental tests based on Taguchi's L27 (3^13) orthogonal array. Analysis of variance (ANOVA) revealed that the mathematical models obtained can adequately describe performance within the ranges of the factors considered. There was good agreement between the experimental and predicted values in this study.
Similar to APPLICATION OF DATA MINING TECHNIQUES FOR THE PREDICTION OF CHRONIC KIDNEY DISEASE
PERFORMANCE OF DATA MINING TECHNIQUES TO PREDICT IN HEALTHCARE CASE STUDY: CH...ijdms
With the promises of predictive analytics in big data, and the use of machine learning algorithms,
predicting future is no longer a difficult task, especially for health sector, that has witnessed a great
evolution following the development of new computer technologies that gave birth to multiple fields of
research. Many efforts are done to cope with medical data explosion on one hand, and to obtain useful
knowledge from it, predict diseases and anticipate the cure on the other hand. This prompted researchers
to apply all the technical innovations like big data analytics, predictive analytics, machine learning and
learning algorithms in order to extract useful knowledge and help in making decisions. In this paper, we
will present an overview on the evolution of big data in healthcare system, and we will apply three learning
algorithms on a set of medical data. The objective of this research work is to predict kidney disease by
using multiple machine learning algorithms that are Support Vector Machine (SVM), Decision Tree (C4.5),
and Bayesian Network (BN), and chose the most efficient one.
PREDICTIVE ANALYTICS IN HEALTHCARE SYSTEM USING DATA MINING TECHNIQUEScscpconf
The health sector has witnessed a great evolution following the development of new computer technologies, and that pushed this area to produce more medical data, which gave birth to multiple fields of research. Many efforts are done to cope with the explosion of medical data on one hand, and to obtain useful knowledge from it on the other hand. This prompted researchers to apply all the technical innovations like big data analytics, predictive analytics, machine learning and learning algorithms in order to extract useful knowledge and help in making decisions. With the promises of predictive analytics in big data, and the use of machine learning
algorithms, predicting future is no longer a difficult task, especially for medicine because predicting diseases and anticipating the cure became possible. In this paper we will present an overview on the evolution of big data in healthcare system, and we will apply a learning algorithm on a set of medical data. The objective is to predict chronic kidney diseases by using Decision Tree (C4.5) algorithm.
Predicting heart failure using a wrapper-based feature selectionnooriasukmaningtyas
In the current health system, it is very difficult for medical practitioners/ physicians to diagnose the effectiveness of heart contraction. In this research, we proposed a machine learning model to predict heart contraction using an artificial neural network (ANN). We also proposed a novel wrapper-based feature selection utilizing a grey wolf optimization (GWO) to reduce the number of required input attributes. In this work, we compared the results achieved using our method and several conventional machine learning algorithms approaches such as support vector machine, decision tree, Knearest neighbor, naïve bayes, random forest, and logistic regression. Computational results show not only that much fewer features are needed, but also higher prediction accuracy can be achieved around 87%. This work has the potential to be applicable to clinical practice and become a supporting tool for doctors/physicians.
Predicting Chronic Kidney Disease using Data Mining Techniquesijtsrd
Kidney is a significant aspect of a human body. Kidney infection or disappointments are expanded in every year. Presently a day’s chronic kidney infection is the most well known disease for the individuals. Today numerous individuals pass on due to chronic kidney disease. The principle issue of CKD is, it will influence the kidney gradually. A few people dont have side effects at all and are analysed by a lab test. It depicts the steady loss of kidney work. Early recognition and therapy are viewed as basic variables in the management and control of chronic kidney disease. Data mining techniques is utilized to extract data from clinical and laboratory, which can be useful to help doctors to recognize the seriousness stage of patients. Using Probabilistic Neural Networks PNN algorithm will get better prediction for determining the severity stage of chronic kidney disease. Seethal V | Kuldeep Baban Vayadande "Predicting Chronic Kidney Disease using Data Mining Techniques" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-1 , December 2020, URL: https://www.ijtsrd.com/papers/ijtsrd37974.pdf Paper URL : https://www.ijtsrd.com/computer-science/data-miining/37974/predicting-chronic-kidney-disease-using-data-mining-techniques/seethal-v
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...IAEMEPublication
Machine learning algorithms play a vital role in prediction of many diseases such as heart disease, diabetes, cancer, lung disease etc. The applicability of machine learning algorithms to healthcare domain relieves the burden of physicians as it is impractical to scan manually all the data collected over a period of time in order to arrive at some valuable information. Machine learning algorithms learn from the training dataset and they become capable of thinking like a human. Once the algorithm completes it learning with training dataset, it can automatically predict the target output label of any unseen data. In this work, predicting diabetes using machine learning algorithms has been taken up. A conceptual architecture has been proposed based on big data architecture.
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...IAEME Publication
Machine learning algorithms play a vital role in prediction of many diseases such as heart disease, diabetes, cancer, lung disease etc. The applicability of machine learning algorithms to healthcare domain relieves the burden of physicians as it is impractical to scan manually all the data collected over a period of time in order to arrive at some valuable information. Machine learning algorithms learn from the training dataset and they become capable of thinking like a human. Once the algorithm completes it learning with training dataset, it can automatically predict the target output label of any unseen data. In this work, predicting diabetes using machine learning algorithms has been taken up. A conceptual architecture has been proposed based on big data architecture.
LIVER DISEASE PREDICTION BY USING DIFFERENT DECISION TREE TECHNIQUESIJDKP
Early prediction of liver disease is very important to save human life and take proper steps to control the
disease. Decision Tree algorithms have been successfully applied in various fields especially in medical
science. This research work explores the early prediction of liver disease using various decision tree
techniques. The liver disease dataset which is select for this study is consisting of attributes like total
bilirubin, direct bilirubin, age, gender, total proteins, albumin and globulin ratio. The main purpose of this
work is to calculate the performance of various decision tree techniques and compare their performance.
The decision tree techniques used in this study are J48, LMT, Random Forest, Random tree, REPTree,
Decision Stump, and Hoeffding Tree. The analysis proves that Decision Stump provides the highest
accuracy than other techniques
Chronic Kidney Disease prediction is one of the most important issues in healthcare analytics. The most interesting and challenging tasks in day to day life is prediction in medical field. In this paper, we employ some machine learning techniques for predicting the chronic kidney disease using clinical data. We use three machine learning algorithms such as Decision Tree(DT) algorithm, Naive Bayesian (NB) algorithm. The performance of the above models are compared with each other in order to select the best classifier in predicting the chronic kidney disease for given dataset.
The Analysis of Performace Model Tiered Artificial Neural Network for Assessm...IJECEIAES
The assessment model of coronary heart disease is so much developed in line with the development of information technology, particularly the field of artificial intelligence. Unfortunately, the assessment models developed mostly do not use such an approach made by the clinician, that is the tiered approach. This makes the assessment process should conduct a thorough examination. This study aims to analyze the performance of a tiered model assessment. The assessment system is divided into several levels, with reference to the stages of the inspection procedure.The method used for each level is, preprocessing, building architecture artificial neural network (ANN), conduct training using the Levenberg-Marquardt algorithm and one step secant, as well as testing the system. The test results showed the influence of each level, both when the output level of the previous positive or negative, were tested back at the next level. The effect indicates that the level above gives performance improvement and or strengthens the performance at the previous level.
Supervised machine learning based liver disease prediction approach with LASS...journalBEEI
In this contemporary era, the uses of machine learning techniques are increasing rapidly in the field of medical science for detecting various diseases such as liver disease (LD). Around the globe, a large number of people die because of this deadly disease. By diagnosing the disease in a primary stage, early treatment can be helpful to cure the patient. In this research paper, a method is proposed to diagnose the LD using supervised machine learning classification algorithms, namely logistic regression, decision tree, random forest, AdaBoost, KNN, linear discriminant analysis, gradient boosting and support vector machine (SVM). We also deployed a least absolute shrinkage and selection operator (LASSO) feature selection technique on our taken dataset to suggest the most highly correlated attributes of LD. The predictions with 10 fold cross-validation (CV) made by the algorithms are tested in terms of accuracy, sensitivity, precision and f1-score values to forecast the disease. It is observed that the decision tree algorithm has the best performance score where accuracy, precision, sensitivity and f1-score values are 94.295%, 92%, 99% and 96% respectively with the inclusion of LASSO. Furthermore, a comparison with recent studies is shown to prove the significance of the proposed system.
Machine learning approach for predicting heart and diabetes diseases using da... IAESIJAI
Environmental changes and food habits expose people to numerous diseases today. Machine learning plays a vital role in predicting diseases from collected data. The health sector holds plenty of electronic medical data, which helps this technique diagnose various diseases quickly and accurately. Accuracy in medical data analysis has improved as the volume of medical data continues to grow. Doctors may have a hard time predicting diseases accurately from symptoms. This proposed work uses Kaggle data to predict and diagnose heart disease and diabetes, which are leading causes of high death rates. The dataset contains a target feature for the diagnosis of heart disease. This work derives the target variable for diabetes by comparing the patient's blood sugar to normal levels. Blood pressure, body mass index (BMI), and other factors are used to diagnose these diseases and disorders. This work justifies the use of the filter method and principal component analysis for feature selection and extraction. The main aim of this work is to highlight the implementation of three ensemble techniques (adaptive boosting, extreme gradient boosting, and gradient boosting) with emphasis on the accuracy of the results.
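The abstract does not say which filter criterion is used. As one common choice, a filter method can rank features by the absolute Pearson correlation between each feature and the target, independently of any classifier. The sketch below assumes that criterion; the helper names `pearson` and `rank_features` are hypothetical, and the principal component analysis and boosting steps are not shown.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(
    features: Dict[str, Sequence[float]],
    target: Sequence[float],
    top_k: int,
) -> List[str]:
    """Return the top_k feature names ordered by |correlation with target|."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)[:top_k]
```

Because the ranking ignores interactions between features, it is usually treated as a cheap first pass before an extraction step such as principal component analysis.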
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
Application of Data Mining Techniques for the Prediction of Chronic Kidney Disease
B. Karthikeyan
https://iaeme.com/Home/journal/IJARET editor@iaeme.com
Data mining classification techniques play a vital role in the healthcare domain by classifying patient datasets [9], [10], and they are used to analyse and predict many diseases. Classification techniques such as the artificial neural network (ANN), K-nearest neighbor (KNN), naïve Bayes, decision tree (J48, C4.5) and support vector machine (SVM) are used by many researchers in the health care area to analyse, detect and predict a variety of diseases. Feature selection methods improve the accuracy of an algorithm by reducing the dimensionality of the feature set, and they can be grouped into wrapper and filter methods [11]. In developing countries, most kidney patients receive treatment only after the disease has reached a serious stage, which increases the number of CKD patients [12]. CKD can be reduced, or even stopped, by early diagnosis through tests such as a blood test, a urine test and a kidney scan, and by consulting a doctor about other symptoms of kidney disease.
Most existing works concentrate on the analysis and prediction of life-threatening diseases such as chronic kidney disease, cancer and diabetes, using feature selection and classification techniques such as KNN, ANN, decision tree (J48/C4.5), SVM and naïve Bayes.
2. RELATED WORKS
K.R. Lakshmi et al. [13] compared the performance of a Bayesian classifier, a support vector machine (SVM) classifier and K-nearest neighbour (KNN) for CKD prediction, based on accuracy and execution time.
Lambodar Jena et al. [14] aimed to use the data in the data set to accurately predict the chronic kidney disease class of each case.
S. Vijayarani et al. [15] predicted kidney disease using support vector machines (SVM) and artificial neural networks (ANN). The aim of the study was to compare the performance of the two algorithms based on accuracy and run-time. The experimental results show that the performance of the ANN is better than that of the other algorithm.
Boukenze et al. [16] outlined the growth of large data sets in the health system and applied three learning algorithms to a collection of medical data. The goal of the study was to predict kidney disease using multiple machine learning algorithms, namely SVM, C4.5 and Bayesian Networks (BNs), and to select the most efficient one.
Yadollahpour, Ali, et al. [17] proposed an adaptive neuro-fuzzy inference system (ANFIS) for predicting the renal failure timeframe of CKD based on real clinical data. The clinical records used cover up to 10 years and were collected from newly diagnosed CKD patients.
Sedighi Z. et al. [18] noted that chronic kidney disease is a common disease that can be prevented by early detection and cure. Classification of kidney disease is needed for practical guidance globally, and data mining and machine learning techniques help to discover knowledge by identifying patterns for classification.
3. PROPOSED FRAMEWORK FOR CLASSIFICATION OF CHRONIC KIDNEY DISEASE
Figure 1 depicts the proposed framework for the classification of Chronic Kidney Disease using data mining techniques, namely feature selection and classification. The framework is composed of two major stages: 1) feature selection, in which the irrelevant and redundant features are removed, and 2) classification of patients into the categories Yes and No. For feature selection, correlation-based feature selection with PSO search optimization is utilized. In the classification stage, the feature selection method is evaluated with the K-Nearest Neighbor (K-NN), Naïve Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT) and Artificial Neural Network (ANN) classifiers.
Figure 1 Proposed Framework for the classification of Chronic Kidney Disease using Data Mining
techniques
3.1. Pre-Processing Step: Feature Selection Technique
Feature subset selection aims to reduce computing time and improve the prediction results of machine learning algorithms. It does so by removing the features/attributes in a dataset that are considered unimportant or unable to contribute positively to the classification; it does not create new features. Fewer features reduce computing time.
Correlation-based Feature Selection (CFS) is suitable for multivariate data. CFS works by calculating the interaction between features. It evaluates a subset of features by taking into account the predictive capability of each feature together with the level of redundancy among the features. The correlation coefficient is used to calculate the correlation between the subset of features and the class, as well as the inter-correlation among the features.
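The trade-off between relevance and redundancy described above is commonly written as the CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where k is the subset size, r_cf the mean feature-class correlation and r_ff the mean feature-feature correlation. The following is a minimal sketch of that score, assuming Pearson correlation on numeric columns; the function names `pearson` and `cfs_merit` are illustrative and are not taken from the paper or from WEKA, whose implementation may use a different correlation measure.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def cfs_merit(features, labels):
    """CFS merit of a feature subset (list of columns) against numeric class labels."""
    k = len(features)
    # Mean absolute feature-class correlation (relevance).
    r_cf = sum(abs(pearson(f, labels)) for f in features) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation (redundancy).
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    r_ff = sum(abs(pearson(features[i], features[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)
```

With this score, adding a feature that is uncorrelated with the class lowers the merit of a subset, while adding an exact duplicate of a relevant feature does not raise it.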
Particle Swarm Optimization (PSO) is an optimization technique based on the social behavior of flocking birds. Its inspiration is the social behavior pattern of organisms that live and interact within large groups. PSO is easier to implement than a Genetic Algorithm, because PSO has no mutation or crossover operators and the movement of the particles is governed by a velocity function. In PSO, every particle adjusts its flight according to its own flying experience and that of its companions as it moves through the search space with a velocity.
The best-fit particle of the entire swarm influences the position of each particle. Each individual particle $j \in [1, \dots, m]$, where $m > 1$, has a current position $y_j$ in the search space, a current velocity $u_j$ and a personal best position $p_{b,j}$, which is the position at which the particle has so far obtained the smallest value of the objective function $f$. From the $p_{b,j}$, the global best position $G_b$ is calculated as the best value obtained by comparing all the $p_{b,j}$.
The $p_{b,j}$ is calculated by using the formula
\[
p_{b,j} =
\begin{cases}
p_{b,j} & \text{if } f(y_j) > f(p_{b,j}) \\
y_j & \text{if } f(y_j) \le f(p_{b,j})
\end{cases}
\]
The formula used to calculate the global best position $G_b$ is
\[
G_b = \arg\min_{p_{b,j}} f(p_{b,j}), \quad j \in [1, \dots, m],\ m > 1
\]
Velocity can be updated by using the formula
\[
u_j(t+1) = w\,u_j(t) + s_1 i_1 \left[\hat{y}_j(t) - y_j(t)\right] + s_2 i_2 \left[g(t) - y_j(t)\right]
\]
where $u_j(t)$ is the velocity of particle j at iteration t, and $w$, $s_1$ and $s_2$ are user-supplied coefficients. $i_1$ and $i_2$ are random values, $y_j(t)$ is the current position, $\hat{y}_j(t)$ is the individual best solution (the personal best $p_{b,j}$) and $g(t)$ is the swarm's global best candidate solution. The term $w\,u_j(t)$ is known as the inertia component, and the inertia weight $w$ lies between 0.8 and 1.2. Lower values of the inertia weight speed up the convergence of the swarm to optima, whereas higher values encourage exploration of the entire search space. The term $s_1 i_1 [\hat{y}_j(t) - y_j(t)]$ is known as the cognitive component, and the term $s_2 i_2 [g(t) - y_j(t)]$ is usually called the social component.
The following steps are done in PSO algorithm:
1. Initialize each particle in the population with random positions and velocities.
2. Repeat the following steps until stopping criterion is met.
i. for each particle
{
Calculate the fitness function value;
Compare the fitness value:
If it is superior to the best fitness value pbest, then the current value is assigned as the pbest value;
}
ii. The particle with the best fitness value among all the particles is selected and assigned as gbest;
iii. for each particle
{
Calculate particle velocity;
Change the position of the particle;
}
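The steps above can be sketched in a few lines of Python. This is a minimal global-best PSO for minimizing a continuous function, not the feature-subset search used in the paper. The function name `pso_minimize`, the parameter values (w = 0.8, s1 = s2 = 1.2), the search bounds and the fixed iteration budget used as the stopping criterion are all assumptions chosen for illustration.

```python
import random

def pso_minimize(f, dim, m=20, iters=200, w=0.8, s1=1.2, s2=1.2,
                 bounds=(-5.0, 5.0), seed=0):
    """Minimize f over `dim` dimensions with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: random positions y_j and velocities u_j for the m particles.
    y = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(m)]
    u = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(m)]
    pbest = [p[:] for p in y]
    pbest_val = [f(p) for p in y]
    g = min(range(m), key=lambda j: pbest_val[j])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    # Step 2: repeat until the stopping criterion (here: iteration budget).
    for _ in range(iters):
        # (i) Evaluate fitness and update each personal best.
        for j in range(m):
            val = f(y[j])
            if val <= pbest_val[j]:
                pbest[j], pbest_val[j] = y[j][:], val
        # (ii) The best personal best becomes the global best.
        g = min(range(m), key=lambda j: pbest_val[j])
        gbest, gbest_val = pbest[g][:], pbest_val[g]
        # (iii) Update velocity (inertia + cognitive + social), then position.
        for j in range(m):
            for d in range(dim):
                i1, i2 = rng.random(), rng.random()
                u[j][d] = (w * u[j][d]
                           + s1 * i1 * (pbest[j][d] - y[j][d])
                           + s2 * i2 * (gbest[d] - y[j][d]))
                y[j][d] += u[j][d]
    return gbest, gbest_val

def sphere(p):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in p)
```

Calling `pso_minimize(sphere, dim=2)` returns the best position found and its objective value. For feature selection, the position would instead encode a feature subset and the objective would be a subset score such as the CFS merit.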
3.2. Classification Techniques
Classification is a very important data mining task whose purpose is to build a classification function or classification model (called a classifier). The classification model maps the data in the database to a specific class. Classifier construction methods include: Decision Tree, Naive Bayes, ANN, K-NN, Support Vector Machine, Rough set, Logistic Regression, Genetic Algorithms (GAs) / Evolutionary Programming (EP).
Naïve Bayes Classification Technique
Naive Bayes is a technique that estimates, from training data, the probabilities of individual attribute values given a class, and then uses these probabilities to classify new instances. It is a simple probabilistic classifier based on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence assumptions. In simple terms, a naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature. The naive Bayesian classifier works as follows:
Step 1: Let T be a training set of tuples and their associated class labels. Each tuple is represented by an m-dimensional attribute vector, $A = (a_1, a_2, \dots, a_m)$, depicting m measurements made on the tuple from m attributes, respectively $X_1, X_2, \dots, X_m$.
Step 2: Suppose that there are n classes $D_1, D_2, \dots, D_n$. Given a tuple A, the classifier predicts that A belongs to the class having the highest posterior probability, conditioned on A. That is, the naive Bayesian classifier predicts that tuple A belongs to the class $D_j$ if and only if
\[
P(D_j \mid A) > P(D_k \mid A) \quad \text{for } 1 \le k \le n,\ k \ne j \tag{7}
\]
Thus $P(D_j \mid A)$ is maximized. The class $D_j$ for which $P(D_j \mid A)$ is maximized is called the maximum posterior hypothesis. By Bayes' theorem,
\[
P(D_j \mid A) = \frac{P(A \mid D_j)\,P(D_j)}{P(A)} \tag{8}
\]
Step 3: Since $P(A)$ is constant for all classes, only the product $P(A \mid D_j)\,P(D_j)$ needs to be maximized.
Step 4: Based on the assumption that the attributes are conditionally independent (i.e., there is no dependence relation between attributes), $P(A \mid D_j)$ is computed using the following equation:
\[
P(A \mid D_j) = \prod_{i=1}^{m} P(a_i \mid D_j) \tag{9}
\]
This reduces the computation cost of maximizing $P(A \mid D_j)\,P(D_j)$, because only the class distribution has to be counted. If $X_i$ is categorical, $P(a_i \mid D_j)$ is the number of tuples of class $D_j$ in T having the value $a_i$ for $X_i$, divided by $|D_{j,T}|$, the number of tuples of class $D_j$ in T. If $X_i$ is continuous-valued, $P(a_i \mid D_j)$ is typically computed from a Gaussian distribution with mean $\mu$ and standard deviation $\sigma$:
\[
g(x, \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{10}
\]
\[
P(a_i \mid D_j) = g(a_i, \mu_{D_j}, \sigma_{D_j}) \tag{11}
\]
where $\mu_{D_j}$ is the mean and $\sigma_{D_j}$ is the standard deviation of the values of attribute $X_i$ for the tuples of class $D_j$. If an attribute value does not occur with every class value, the corresponding probability will be zero, and the posterior probability will also be zero.
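Steps 1 to 4 can be illustrated for continuous attributes with a small Gaussian naive Bayes sketch. The function names (`gaussian`, `fit_gaussian_nb`, `predict_nb`) and the small constants added to avoid division by zero and log(0) are assumptions for this illustration, not part of the paper's WEKA setup.

```python
import math
from collections import defaultdict

def gaussian(x, mu, sigma):
    """Gaussian density g(x, mu, sigma) as in Equation (10)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def fit_gaussian_nb(rows, labels):
    """Estimate the prior P(D_j) and per-attribute (mean, std) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        prior = len(members) / len(rows)
        stats = []
        for column in zip(*members):
            mu = sum(column) / len(column)
            var = sum((v - mu) ** 2 for v in column) / len(column)
            stats.append((mu, math.sqrt(var) + 1e-9))  # guard against sigma = 0
        model[label] = (prior, stats)
    return model

def predict_nb(model, row):
    """Return the class maximizing P(A|D_j) P(D_j), computed in log space."""
    def log_posterior(label):
        prior, stats = model[label]
        return math.log(prior) + sum(
            math.log(gaussian(x, mu, s) + 1e-300) for x, (mu, s) in zip(row, stats))
    return max(model, key=log_posterior)
```

Working with log probabilities turns the product of Equation (9) into a sum, which avoids numerical underflow when many attributes are multiplied together.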
K-Nearest Neighbour Classification Method
The k-nearest neighbors algorithm is one of the most widely used algorithms in machine learning. It is an instance-based learning method that does not require a learning phase. The model consists of the training sample together with a distance function and a choice function that selects the class based on the classes of the nearest neighbors. Before classifying a new element, we must compare it with the other elements using a similarity measure. Its k nearest neighbors are then considered, and the class that appears most often among them is assigned to the element to be classified. The neighbors are weighted by the distance that separates them from the new element. The proper functioning of the method depends on the choice of a few parameters, such as k, the number of neighbors used to assign the class to the new element, and the distance measure used.
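A minimal sketch of the procedure described above, assuming Euclidean distance and inverse-distance weighting of the votes; the name `knn_predict` and the small constant that avoids division by zero are illustrative choices.

```python
import math
from collections import defaultdict

def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by a distance-weighted vote of its k nearest neighbours."""
    # Sort the training elements by Euclidean distance to the query.
    dists = sorted((math.dist(row, query), label)
                   for row, label in zip(train_rows, train_labels))
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + 1e-9)  # closer neighbours weigh more
    return max(votes, key=votes.get)
```

With inverse-distance weighting, a single very close neighbour can outvote several distant ones, which is the effect of weighting the neighbours by distance.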
Decision Tree Classification Method
The decision tree is a structure that includes a root node, branches and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes an outcome of the test and each leaf node holds a class label. The topmost node in the tree is the root node. The decision tree approach is powerful for classification problems. The technique has two steps: building a tree and applying the tree to the dataset. Popular decision tree algorithms include CART, ID3, C4.5, CHAID and J48 (the WEKA implementation of C4.5).
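The second step, applying a tree to a record, can be sketched as follows. The tree here is hand-built for illustration only: the attribute names follow the CKD dataset, but the structure and thresholds are hypothetical and were not learned from data or taken from the paper.

```python
# Hypothetical hand-built tree: attribute names follow the CKD dataset (al, sg),
# but the thresholds and structure are illustrative, not learned from data.
TREE = {
    "attribute": "al", "threshold": 0.5,            # root node: test on albumin
    "le": {"attribute": "sg", "threshold": 1.0175,  # internal node: specific gravity
           "le": {"label": "ckd"},                  # leaf nodes hold class labels
           "gt": {"label": "notckd"}},
    "gt": {"label": "ckd"},
}

def classify(node, record):
    """Follow the branch matching each test outcome until a leaf is reached."""
    while "label" not in node:
        branch = "le" if record[node["attribute"]] <= node["threshold"] else "gt"
        node = node[branch]
    return node["label"]
```

For example, `classify(TREE, {"al": 0, "sg": 1.020})` walks from the root through the `sg` test to a leaf and returns that leaf's label.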
Support Vector Machine
Support vector machine (SVM) is an algorithm that attempts to find a linear separator (hyper-
plane) between the data points of two classes in multidimensional space. SVMs are well suited
to dealing with interactions among features and redundant features.
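Once a separating hyperplane has been found, classifying a point reduces to checking on which side of the hyperplane it lies. The sketch below shows only this decision rule, assuming a weight vector `w` and bias `b` that have already been learned. The names and the example values are hypothetical, and the training step (margin maximization) is not shown.

```python
def svm_decision(w, b, x):
    """Signed score of x relative to the hyperplane w . x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def svm_predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if svm_decision(w, b, x) >= 0 else -1
```

With the illustrative values `w = [1.0, -1.0]` and `b = 0.0`, points with a larger first coordinate fall on the positive side and points with a larger second coordinate on the negative side.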
Artificial Neural Network Classification Method
An Artificial Neural Network (ANN) is a collection of neuron-like processing units with weighted connections between the units. It maps a set of input data onto a set of appropriate output data. It consists of three layers: an input layer, a hidden layer and an output layer. There are connections between the layers, and a weight is assigned to each connection. The primary function of the neurons of the input layer is to distribute the input $x_i$ to the neurons in the hidden layer. Each neuron of the hidden layer sums the input signals $x_i$, weighted by the weights $w_{ji}$ of the respective connections from the input layer. The output $Y_j$ is given by $Y_j = f(\sum_i w_{ji} x_i)$, where $f$ is a simple threshold (activation) function such as the sigmoid or hyperbolic tangent function.
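The output formula for a layer of neurons can be written directly in code. This is a minimal forward-pass sketch, assuming one weight row per neuron and no bias term (the formula in the text has none). The names `sigmoid` and `layer_output` are illustrative.

```python
import math

def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_output(weights, inputs, f=sigmoid):
    """Compute Y_j = f(sum_i w_ji * x_i) for every neuron j in the layer."""
    return [f(sum(w * x for w, x in zip(row, inputs))) for row in weights]
```

Passing `f=math.tanh` switches the activation to the hyperbolic tangent; a full network would feed the hidden-layer outputs into a second call for the output layer.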
4. RESULT AND DISCUSSION
4.1. Dataset Description
The benchmark Chronic Kidney Disease dataset is considered for the classification of kidney
disease using data mining in this research work. Table 1 gives the detailed description of the
CKD dataset.
Table 1 Description of the Chronic Kidney Disease dataset
Sl.No | Feature Name | Description
1 | age | Age in years
2 | bp | Blood pressure in mm/Hg
3 | sg | Specific gravity (1.005, 1.010, 1.015, 1.020, 1.025)
4 | al | Albumin (0, 1, 2, 3, 4, 5)
5 | su | Sugar (0, 1, 2, 3, 4, 5)
6 | rbc | Red blood cells (normal, abnormal)
7 | pc | Pus cell (normal, abnormal)
8 | pcc | Pus cell clumps (present, notpresent)
9 | ba | Bacteria (present, notpresent)
10 | bgr | Blood glucose random in mgs/dl
11 | bu | Blood urea in mgs/dl
12 | sc | Serum creatinine in mgs/dl
13 | sod | Sodium in mEq/L
14 | pot | Potassium in mEq/L
15 | hemo | Haemoglobin in gms
16 | pcv | Packed cell volume
17 | wc | White blood cell count in cells/cumm
18 | rc | Red blood cell count in millions/cmm
19 | htn | Hypertension (yes, no)
20 | dm | Diabetes mellitus (yes, no)
21 | cad | Coronary artery disease (yes, no)
22 | appet | Appetite (good, poor)
23 | pe | Pedal edema (yes, no)
24 | ane | Anemia (yes, no)
25 | class | Class (ckd, notckd)
4.2. Number of Features obtained
Table 2 lists the 11 features obtained by using correlation-based feature selection with the PSO search method.
Table 2 Number of Features obtained by Feature Selection
Sl.No Feature Name
1 Blood Pressure
2 Specific Gravity
3 Albumin
4 Red Blood Cells
5 Pus Cell
6 Packed Cell Volume(numerical)
7 Hypertension
8 Diabetes Mellitus
9 Appetite
10 Pedal Edema
11 Anemia
4.3. Performance Metrics
Table 3 depicts the performance metrics considered for evaluating the feature selection
technique and classification techniques.
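Table 3 List of Performance Metrics
Metrics | Equation
Accuracy | $\frac{TP + TN}{TP + FN + TN + FP}$
True Positive Rate (TPR) | $\frac{TP}{TP + FN}$
False Positive Rate (FPR) | $\frac{FP}{FP + TN}$
Precision | $\frac{TP}{TP + FP}$
Recall | $\frac{TP}{TP + FN}$
F-Measure | $2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}$
Mean Absolute Error (MAE) | $\frac{1}{N} \sum_{i=1}^{N} |\hat{\theta}_i - \theta_i|$
Root Mean Squared Error (RMSE) | $\sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{\theta}_i - \theta_i)^2}$
Relative Absolute Error (RAE) | $\frac{\sum_{i=1}^{N} |\hat{\theta}_i - \theta_i|}{\sum_{i=1}^{N} |\bar{\theta} - \theta_i|}$
Root Relative Squared Error (RRSE) | $\sqrt{\frac{\sum_{i=1}^{N} (\hat{\theta}_i - \theta_i)^2}{\sum_{i=1}^{N} (\bar{\theta} - \theta_i)^2}}$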
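The metrics of Table 3 can be computed directly from the confusion-matrix counts and from paired predicted and actual values. The sketch below is a plain transcription of the formulas, with hypothetical function names. WEKA computes its error measures on class probability estimates, so its reported values need not coincide with this simplified version.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, TPR, FPR, precision, recall and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "tpr": recall,
        "fpr": fp / (fp + tn),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
    }

def error_metrics(predicted, actual):
    """MAE, RMSE, RAE and RRSE; the relative errors use the mean of `actual` as baseline."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for p, a in zip(predicted, actual))
    sq_err = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    base_abs = sum(abs(mean_actual - a) for a in actual)
    base_sq = sum((mean_actual - a) ** 2 for a in actual)
    return {
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rae": abs_err / base_abs,
        "rrse": math.sqrt(sq_err / base_sq),
    }
```

RAE and RRSE values above 1 (100 %) mean that the predictions are worse than always predicting the mean of the actual values.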
Table 4 shows the performance of the Naïve Bayes classification technique on the original dataset and on the dataset pre-processed by feature selection. The pre-processed dataset performs better than the original dataset on every metric; for example, accuracy rises from 78.2278 % to 96.25 %.
Table 4 Performance analysis of the Naïve Bayes classification technique for the original dataset and the pre-processed dataset
Performance Metrics | Original Dataset | Pre-Processed Dataset
Accuracy | 78.2278 % | 96.25 %
Kappa Statistic | 0.5672 | 0.9274
MAE | 0.0879 | 0.0296
RMSE | 0.2691 | 0.145
RAE | 45.8877 % | 8.7118 %
RRSE | 87.3506 % | 35.2013 %
TPR | 0.782 | 0.963
FPR | 0.172 | 0.021
Precision | 0.812 | 0.966
F-Measure | 0.791 | 0.963
ROC Area | 0.892 | 1.000
Table 5 shows the performance of the ANN classification technique on the original dataset and on the dataset pre-processed by feature selection. The pre-processed dataset performs better than the original dataset on every metric; for example, accuracy rises from 70.8861 % to 96.25 %.
Table 5 Performance analysis of the ANN classification technique for the original dataset and the pre-processed dataset
Performance Metrics | Original Dataset | Pre-Processed Dataset
Accuracy | 70.8861 % | 96.25 %
Kappa Statistic | 0.3688 | 0.9278
MAE | 0.1165 | 0.0248
RMSE | 0.3209 | 0.157
RAE | 60.3793 % | 7.2977 %
RRSE | 104.1767 % | 38.1254 %
TPR | 0.709 | 0.963
FPR | 0.347 | 0.015
Precision | 0.6885 | 0.969
F-Measure | 0.684 | 0.964
ROC Area | 0.693 | 0.978
Table 6 shows the performance of the Support Vector Machine classification technique on the original dataset and on the dataset pre-processed by feature selection. The pre-processed dataset performs better than the original dataset on every metric; for example, accuracy rises from 82.0253 % to 98 %.
Table 6 Performance analysis of the SVM classification technique for the original dataset and the pre-processed dataset
Performance Metrics | Original Dataset | Pre-Processed Dataset
Accuracy | 82.0253 % | 98 %
Kappa Statistic | 0.625 | 0.9609
MAE | 0.2397 | 0.2267
RMSE | 0.3248 | 0.2802
RAE | 124.2287 % | 66.6271 %
RRSE | 105.4194 % | 68.0246 %
TPR | 0.820 | 0.980
FPR | 0.181 | 0.016
Precision | 0.813 | 0.980
F-Measure | 0.813 | 0.980
ROC Area | 0.808 | 0.982
Table 7 shows the performance of the K-Nearest Neighbor classification technique on the original dataset and on the dataset pre-processed by feature selection. The pre-processed dataset performs better than the original dataset on every metric; for example, accuracy rises from 74.4304 % to 90.5 %.
Table 7 Performance analysis of the KNN classification technique for the original dataset and the pre-processed dataset
Performance Metrics | Original Dataset | Pre-Processed Dataset
Accuracy | 74.4304 % | 90.5 %
Kappa Statistic | 0.4341 | 0.8067
MAE | 0.1023 | 0.0633
RMSE | 0.3198 | 0.2517
RAE | 53.0184 % | 18.6164 %
RRSE | 103.813 % | 61.0936 %
TPR | 0.744 | 0.905
FPR | 0.327 | 0.108
Precision | 0.73 | 0.908
F-Measure | 0.7165 | 0.892
ROC Area | 0.705 | 0.898
Table 8 shows the performance of the J48 decision tree classification technique on the original dataset and on the dataset pre-processed by feature selection. The pre-processed dataset performs better than the original dataset on every metric; for example, accuracy rises from 81.2658 % to 96.25 %.
Table 8 Performance analysis of the J48 classification technique for the original dataset and the pre-processed dataset
Performance Metrics | Original Dataset | Pre-Processed Dataset
Accuracy | 81.2658 % | 96.25 %
Kappa Statistic | 0.593 | 0.9275
MAE | 0.1226 | 0.0459
RMSE | 0.2478 | 0.1416
RAE | 63.535 % | 13.4814 %
RRSE | 80.4331 % | 34.3786 %
TPR | 0.813 | 0.963
FPR | 0.233 | 0.022
Precision | 0.8045 | 0.966
F-Measure | 0.798 | 0.963
ROC Area | 0.785 | 0.985
Figure 2 shows the accuracy (in %) of the NB, ANN, KNN, SVM and J48 classification techniques on the original dataset and the pre-processed dataset. The reduced dataset gives higher accuracy than the original dataset for every classification method.
Figure 3 shows the Kappa statistic of the NB, ANN, KNN, SVM and J48 classification techniques on the original dataset and the pre-processed dataset. The reduced dataset gives a higher Kappa statistic than the original dataset for every classification method.
Figure 4 shows the Mean Absolute Error (MAE) of the NB, ANN, KNN, SVM and J48 classification techniques on the original dataset and the pre-processed dataset. The reduced dataset gives a lower MAE than the original dataset for every classification method.
Figure 5 shows the Root Mean Squared Error (RMSE) of the NB, ANN, KNN, SVM and J48 classification techniques on the original dataset and the pre-processed dataset. The reduced dataset gives a lower RMSE than the original dataset for every classification method.
Figure 6 shows the Relative Absolute Error (RAE, in %) of the NB, ANN, KNN, SVM and J48 classification techniques on the original dataset and the pre-processed dataset. The reduced dataset gives a lower RAE than the original dataset for every classification method.
Figure 7 shows the Root Relative Squared Error (RRSE, in %) of the NB, ANN, KNN, SVM and J48 classification techniques on the original dataset and the pre-processed dataset. The reduced dataset gives a lower RRSE than the original dataset for every classification method.
Figure 8 shows the True Positive Rate (TPR) of the NB, ANN, KNN, SVM and J48 classification techniques on the original dataset and the pre-processed dataset. The reduced dataset gives a higher TPR than the original dataset for every classification method.
Figure 9 shows the False Positive Rate (FPR) of the NB, ANN, KNN, SVM and J48 classification techniques on the original dataset and the pre-processed dataset. The reduced dataset gives a lower FPR than the original dataset for every classification method.
Figure 2 Performance Analysis on Accuracy in % of the original dataset and pre-processed dataset
using NB, ANN, KNN, SVM and J48 classification techniques
Figure 10 shows the Precision of the NB, ANN, KNN, SVM and J48 classification techniques on the original dataset and the pre-processed dataset. The reduced dataset gives higher precision than the original dataset for every classification method.
Figure 3 Performance analysis on the Kappa Statistic of the original dataset and pre-processed dataset
using NB, ANN, SVM, KNN and J48 classification techniques
Figure 4 Performance Analysis on the Mean Absolute Error of the original dataset and pre-processed
dataset using NB, ANN, SVM, KNN and J48 classification techniques
Figure 5 Performance Analysis on the Root Mean Squared Error of the original dataset and Pre-
processed dataset using NB, ANN, SVM, KNN and J48 classification techniques
Figure 6 Performance analysis on the Relative Absolute Error (RAE) in % of the original dataset and
pre-processed dataset using NB, ANN, SVM, KNN and J48 classification techniques
Figure 7 Performance analysis on the Root Relative Squared Error in % of the original dataset and
pre-processed dataset using NB, ANN, SVM, KNN and J48 classification techniques
Figure 8 Performance analysis on the True Positive Rate of the original dataset and pre-processed
dataset using NB, ANN, SVM, KNN and J48 classification techniques
Figure 9 Performance analysis on the False Positive Rate of the original dataset and pre-processed
dataset using NB, ANN, SVM, KNN and J48 classification techniques
Figure 10 Performance Analysis on the Precision of the original dataset and pre-processed dataset
using NB, ANN, SVM, KNN and J48 classification techniques
5. CONCLUSION
In this study, a feature selection method and an ensemble method have been used on the CKD dataset to improve the accuracy of the classifiers. For feature selection, the Correlation-based Feature Selection method combined with Particle Swarm Optimization (PSO) search has been used. The accuracy of the KNN, J48, ANN, NB and SVM classifiers on the original CKD dataset has been compared with their accuracy on the reduced dataset obtained by Correlation-based Feature Selection combined with PSO search. The experimental results show that the accuracy of every classifier improved after reducing the dataset.
In the future, we will integrate real-time patient data with the recorded dataset, storing the patient records in a database, in order to improve prediction systems for chronic kidney disease and other diseases by using a hybrid algorithm. We also suggest including a visualization method, such as a logical representation that transforms the useful knowledge base into a visual form for the user.
REFERENCES
[1] K. Chandel, V. Kunwar, S. Sabitha, T. Choudhury, and S. Mukherjee, A comparative study on
thyroid disease detection using K-nearest neighbor and Naive Bayes classification techniques,
CSI Trans. ICT, vol. 4, no. 2-4, pp. 313-319, Dec. 2016.
[2] B. Boukenze, A. Haqiq, and H. Mousannif, Predicting Chronic Kidney Failure Disease Using
Data Mining Techniques, in Advances in Ubiquitous Networking 2, Springer, Singapore, 2017,
pp. 701-712.
[3] T. Shaikhina, D. Lowe, S. Daga, D. Briggs, R. Higgins, and N. Khovanova, Decision tree and
random forest models for outcome prediction in antibody incompatible kidney transplantation,
Biomed. Signal Process. Control, Feb. 2017.
[4] R. Ani, G. Sasi, U.R. Sankar, & O.S. Deepa, (2016, September). Decision support system for
diagnosis and prediction of chronic renal failure using random subspace classification, IEEE
Conference Publication. [Online]. Available:
http://ieeexplore.ieee.org/abstract/document/7732224/?reload=true. [Accessed: 15-Dec-2017].
[5] L.-C. Cheng, Y.-H. Hu, and S.-H. Chiou, Applying the Temporal Abstraction Technique to the
Prediction of Chronic Kidney Disease Progression, J. Med. Syst., vol. 41, no. 5, p. 85, May
2017.
[6] H. Polat, H. D. Mehr, and A. Cetin, Diagnosis of Chronic Kidney Disease Based on Support
Vector Machine by Feature Selection Methods, J. Med. Syst., vol. 41, no. 4, p. 55, Apr. 2017.
[7] P. Pangong and N. Iam-On, Predicting transitional interval of kidney disease stages 3 to 5 using
data mining method, in 2016 Second Asian Conference on Defence Technology (ACDT), 2016,
pp. 145-150.
[8] K. R. A. Padmanaban and G. Parthiban, Applying Machine Learning Techniques for Predicting
the Risk of Chronic Kidney Disease, Indian J. Sci. Technol., vol. 9, no. 29, Aug. 2016.
[9] S. Perveen, M. Shahbaz, A. Guergachi, and K. Keshavjee, Performance Analysis of Data
Mining Classification Techniques to Predict Diabetes, Procedia Comput. Sci., vol. 82, no.
Supplement C, pp. 115-121, Jan. 2016.
[10] U. N. Dulhare and M. Ayesha, Extraction of action rules for chronic kidney disease using Naive
Bayes classifier, in 2016 IEEE International Conference on Computational Intelligence and
Computing Research (ICCIC), 2016, pp. 1-5.
[11] N. Borisagar, D. Barad, and P. Raval, Chronic Kidney Disease Prediction Using Back
Propagation Neural Network Algorithm, in Proceedings of International Conference on
Communication and Networks, Springer, Singapore, 2017, pp. 295-303.
[12] A. I. Pritom, M. A. R. Munshi, S. A. Sabab, and S. Shihab, Predicting breast cancer recurrence
using effective classification and feature selection technique, in 2016 19th International
Conference on Computer and Information Technology (ICCIT), 2016, pp. 310-314.
[13] Lakshmi, K.R., Nagesh, Y., & VeeraKrishna, M, Performance Comparison of Three Data
Mining Techniques for Predicting Kidney Dialysis Survivability, International Journal of
Advances in Engineering & Technology, Mar., Vol.7, Issue 1, Page No: 242-254, 2013.
[14] Lambodar Jena, Narendra Ku. Kamila, Distributed Data Mining Classification Algorithms for
Prediction of Chronic Kidney Disease, International Journal of Emerging Research in
Management & Technology ISSN: 2278-9359(Vol 4, Issue-11), IJERMT, 2015.
[15] Vijayarani, S., & Dhayananad, M., S Kidney Disease Prediction using SVM and ANN
algorithms, International Journal of Computing and Business Research (IJCBR), Volume 6
Issue 2 March 2015.
[16] Boukenze, Basma, Abdelkrim Haqiq, and Hajar Mousannif. "Predicting Chronic Kidney Failure
Disease Using Data Mining Techniques." Advances in Ubiquitous Networking 2. Springer,
Singapore, 2017. 701-712.
[17] Yadollahpour, Ali, et al. "Designing and implementing an ANFIS based medical decision
support system to predict chronic kidney disease progression." Frontiers in physiology 9 (2018).
[18] Sedighi, Zeinab, Hossein Ebrahimpour-Komleh, and Seyed Jalaleddin Mousavirad. "Featue
selection effects on kidney desease analysis." 2015 International Congress on Technology,
Communication and Knowledge (ICTCK). IEEE, 2015.
[19] Subhashini, M., & Gopinath, R., Mapreduce Methodology for Elliptical Curve Discrete
Logarithmic Problems – Securing Telecom Networks, International Journal of Electrical
Engineering and Technology, 11(9), 261-273 (2020).
[20] Upendran, V., & Gopinath, R., Feature Selection based on Multicriteria Decision Making for
Intrusion Detection System, International Journal of Electrical Engineering and Technology,
11(5), 217-226 (2020).
[21] Upendran, V., & Gopinath, R., Optimization based Classification Technique for Intrusion
Detection System, International Journal of Advanced Research in Engineering and Technology,
11(9), 1255-1262 (2020).
[22] Subhashini, M., & Gopinath, R., Employee Attrition Prediction in Industry using Machine
Learning Techniques, International Journal of Advanced Research in Engineering and
Technology, 11(12), 3329-3341 (2020).
[23] Rethinavalli, S., & Gopinath, R., Classification Approach based Sybil Node Detection in Mobile
Ad Hoc Networks, International Journal of Advanced Research in Engineering and Technology,
11(12), 3348-3356 (2020).
[24] Rethinavalli, S., & Gopinath, R., Botnet Attack Detection in Internet of Things using
Optimization Techniques, International Journal of Electrical Engineering and Technology,
11(10), 412-420 (2020).
[25] Priyadharshini, D., Poornappriya, T.S., & Gopinath, R., A fuzzy MCDM approach for
measuring the business impact of employee selection, International Journal of Management
(IJM), 11(7), 1769-1775 (2020).
[26] Poornappriya, T.S., Gopinath, R., Application of Machine Learning Techniques for Improving
Learning Disabilities, International Journal of Electrical Engineering and Technology (IJEET),
11(10), 392-402 (2020).