Prediction of Early-Stage Chronic Kidney Disease Using
Machine Learning with Advanced Feature Selection
Motivation / Background
• Chronic Kidney Disease (CKD) is one of the deadliest non-communicable diseases globally.
• In 2017, CKD caused 1.2 million deaths worldwide, a rise of 41.5% since 1990.
• The global burden of CKD is increasing, and CKD is projected to become the 5th most common cause of years of life lost globally by 2040.
• In CKD, the kidneys are unable to perform essential functions in the body, which leads to other critical conditions such as heart disease, high blood pressure, and diabetes.
• Therefore, early-stage detection of CKD is essential for containing further progression of the disease, which in turn reduces the mortality rate and treatment cost significantly.
Literature Review
One study proposed density-based feature selection with Ant Colony Optimization (D-ACO) for CKD detection, reporting an accuracy of 95% and a sensitivity of 96%.
Another study proposed a bio-inspired Fruit Fly Optimization (FFO) feature selection technique with a Multi-Kernel Support Vector Machine (MKSVM) classifier, reporting an accuracy of 98.5% and a sensitivity of 97.6%.
A third study employed a neuro-fuzzy algorithm for CKD detection, reporting an accuracy of 97%.
A fourth study used a deep-learning-based stacked auto-encoder for feature selection, reporting 100% accuracy and 100% sensitivity.
Problem Statement
Previous studies employed only a limited range of feature selection techniques, mainly filter and wrapper methods.
The existing literature handles missing values only with mean- and mode-based imputation.
No previous study employed a model interpretation technique to explain its black-box model.
Apart from D-ACO, none of the techniques employed meta-heuristic-based advanced feature selection to choose an optimal feature set for early-stage CKD prediction.
Methodology
[Figure: Architecture of Machine Learning Pipeline]
[Figure: Block Diagram of Proposed Feature Selection Method]
• Grey Wolf Optimization (GWO) is a meta-heuristic algorithm introduced by S. Mirjalili, S. M. Mirjalili, and A. Lewis in 2014.
• It is based on the leadership hierarchy and hunting pattern of grey wolves in nature.
• In terms of leadership hierarchy, α is the leader and decision maker.
• β and δ assist α in decision making.
• The rest of the wolves are ω, which serve as scapegoats.
• The main steps of grey wolf hunting are:
1. Searching for the prey
2. Tracking, chasing, and approaching the prey
3. Pursuing, encircling, and harassing the prey until it stops moving
4. Attacking the prey
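The slides do not state the position update rule. For reference, the encircling and hunting steps are usually written as below in the original GWO formulation; the symbols (X for a wolf's position, X_p for the prey, a for the coefficient that decreases from 2 to 0, r_1 and r_2 for uniform random vectors in [0, 1]) are not taken from the slides.

```latex
\begin{align}
\vec{D} &= \lvert \vec{C} \cdot \vec{X}_p(t) - \vec{X}(t) \rvert, &
\vec{X}(t+1) &= \vec{X}_p(t) - \vec{A} \cdot \vec{D}, \\
\vec{A} &= 2a\,\vec{r}_1 - a, &
\vec{C} &= 2\,\vec{r}_2 .
\end{align}
% The prey position is unknown, so it is approximated by the three best wolves:
\begin{align}
\vec{X}_1 &= \vec{X}_\alpha - \vec{A}_1 \cdot \lvert \vec{C}_1 \cdot \vec{X}_\alpha - \vec{X} \rvert, \\
\vec{X}_2 &= \vec{X}_\beta - \vec{A}_2 \cdot \lvert \vec{C}_2 \cdot \vec{X}_\beta - \vec{X} \rvert, \\
\vec{X}_3 &= \vec{X}_\delta - \vec{A}_3 \cdot \lvert \vec{C}_3 \cdot \vec{X}_\delta - \vec{X} \rvert, \\
\vec{X}(t+1) &= \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}.
\end{align}
```

When |A| > 1 the wolves tend to move away from the leaders (searching), and when |A| < 1 they close in on them (attacking).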
Methodology
Flow Chart of Grey Wolf Optimization Algorithm
1. Start
2. Initialize parameters (number of grey wolves, number of iterations, etc.)
3. Create the initial population of grey wolves with different social hierarchy (α, β, δ, and ω)
4. Estimate the position of the prey by α, β, and δ
5. Evaluate the position of the grey wolves by the position of the prey
6. Grade the grey wolves
7. Stopping criteria satisfied? If yes, end; if no, continue iterating.
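The flow chart can be sketched in code roughly as follows. This is a minimal continuous-valued GWO under stated assumptions, not the MV-GWO feature selection method of the slides: the function name `gwo_minimize`, the `sphere` test objective, the parameter defaults, and the choice to keep the best three positions seen so far as leaders are all illustrative.

```python
import random


def gwo_minimize(objective, dim, bounds, n_wolves=20, n_iter=100, seed=0):
    """Illustrative Grey Wolf Optimizer for continuous minimization."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Create the initial population at random positions inside the bounds.
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]

    def grade(population):
        # Rank by fitness; copies of the best three become alpha, beta, delta.
        ranked = sorted(population, key=objective)
        return [list(w) for w in ranked[:3]]

    alpha, beta, delta = grade(wolves)
    for t in range(n_iter):
        a = 2.0 * (1 - t / n_iter)  # decreases linearly from 2 towards 0
        for w in wolves:
            for j in range(dim):
                estimate = 0.0
                for leader in (alpha, beta, delta):
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[j] - w[j])
                    estimate += leader[j] - A * D
                # New coordinate: mean of the three leader-guided estimates, clipped.
                w[j] = min(hi, max(lo, estimate / 3))
        # Re-grade; old leaders stay in the ranking so the best solution is never lost.
        alpha, beta, delta = grade(wolves + [alpha, beta, delta])
    return alpha, objective(alpha)


def sphere(x):
    # Simple convex test objective with its minimum of 0 at the origin.
    return sum(v * v for v in x)


best_position, best_value = gwo_minimize(sphere, dim=3, bounds=(-10.0, 10.0))
```

For feature selection, the continuous positions would additionally have to be mapped to a binary include/exclude mask and the objective replaced by a classifier-based score; the slides do not specify how MV-GWO does this, so it is left out here.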
Methodology
Model Interpretation: SHapley Additive exPlanations (SHAP)
• SHAP was introduced by Lundberg and Lee (2017).
• The main objective of SHAP is to explain a model's prediction by showing the contribution of each feature to that prediction.
• It builds on cooperative game theory, where the success of a team is attributed to the contribution of each player in the game.
• It compares what the model would predict with and without a feature.
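To make the "with and without a feature" idea concrete, here is a small sketch that computes exact Shapley values by enumerating every feature subset. It is a simplification, not the SHAP library: "removing" a feature is modelled by replacing it with a single baseline value, and `shapley_values`, `toy_predict`, and the sample inputs are hypothetical names and data chosen for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x (feasible only for few features)."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their real value; the rest take the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_predict(z):
    # Hypothetical model: one additive term and one interaction term.
    return 3.0 * z[0] + 2.0 * z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_predict, x, baseline)
```

Here the additive feature receives its full effect, the interaction effect is split equally between the two interacting features, and the contributions sum to the difference between the prediction for `x` and the prediction for the baseline.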
Methodology
Model Interpretation: Partial Dependence Plot (PDP)
• PDPs were proposed by Friedman (2001).
• The partial dependence plot is a model-agnostic tool that plots how the average predicted value changes as a feature is varied, averaging over the marginal distribution of the other features in the dataset.
• A PDP gives the overall picture of a feature's contribution to the prediction, i.e., how the predicted value changes as the value of the feature changes.
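The averaging behind a partial dependence curve can be sketched in a few lines. The helper name `partial_dependence`, the toy model, and the three data rows are assumptions made for illustration only.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value in turn."""
    curve = []
    for value in grid:
        predictions = []
        for row in rows:
            modified = list(row)       # copy so the dataset is left untouched
            modified[feature] = value  # overwrite only the feature of interest
            predictions.append(predict(modified))
        curve.append(sum(predictions) / len(predictions))
    return curve


def linear_toy_model(row):
    # Hypothetical model used only to make the averaging visible.
    return 2.0 * row[0] + row[1]


rows = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
pd_curve = partial_dependence(linear_toy_model, rows, feature=0, grid=[0.0, 1.0, 2.0])
```

For this toy model the curve rises by 2 per unit of the first feature, offset by the mean of the second feature, which is what a plotted PDP would show as a straight line.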
Results and Discussion
Comparison of averaged 10-fold cross-validation accuracy with and without feature selection

Model           Accuracy (%) without feature selection   Accuracy (%) with feature selection
Random Forest   97.18                                    98.43
AdaBoost        99.37                                    99.37
XGBoost         96.87                                    96.87
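As a reminder of what an averaged 10-fold cross-validation accuracy measures, the procedure can be sketched as below. This is a plain, unstratified k-fold written for illustration; the helper names, the midpoint-threshold classifier, and the synthetic one-feature data are assumptions and have nothing to do with the models or dataset in the slides.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, predict, X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the accuracies."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def fit_threshold(X_train, y_train):
    # Hypothetical one-feature classifier: threshold at the midpoint of the class means.
    mean0 = sum(x[0] for x, label in zip(X_train, y_train) if label == 0) / y_train.count(0)
    mean1 = sum(x[0] for x, label in zip(X_train, y_train) if label == 1) / y_train.count(1)
    return (mean0 + mean1) / 2


def predict_threshold(threshold, x):
    return 1 if x[0] > threshold else 0


# Synthetic, well-separated data: class 0 at 0..19, class 1 at 100..119.
X = [[float(v)] for v in range(20)] + [[float(v)] for v in range(100, 120)]
y = [0] * 20 + [1] * 20
mean_accuracy = cross_val_accuracy(fit_threshold, predict_threshold, X, y, k=10)
```

A stratified split, which keeps the class ratio equal in every fold, is often preferred for imbalanced medical data; the slides do not say which variant was used.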
Results and Discussion
Comparison of results on test set with and without feature selection

Criterion                   ML Algorithm    Accuracy (%)   Precision (%)   Sensitivity (%)   Specificity (%)   F1-score (%)   MCC
Without Feature Selection   Random Forest   95.00          97.91           94.00             96.67             95.91          0.8959
Without Feature Selection   AdaBoost        97.50          100.00          96.00             100.00            97.95          0.9486
Without Feature Selection   XGBoost         96.25          100.00          94.00             100.00            96.90          0.9244
With Feature Selection      Random Forest   98.75          98.03           100.00            96.67             99.00          0.9735
With Feature Selection      AdaBoost        96.25          97.95           96.00             96.67             96.96          0.9208
With Feature Selection      XGBoost         96.25          94.33           100.00            90.00             97.08          0.9214
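The six reported metrics all derive from the four confusion-matrix counts. The sketch below uses the standard definitions; the counts passed in (50 true positives, 1 false positive, 29 true negatives, 0 false negatives) are not stated in the slides but are an inference that is consistent with the Random Forest row with feature selection, assuming an 80-sample test set with CKD as the positive class.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # also called recall or true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient ranges from -1 to 1, so it is not a percentage.
    mcc = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Inferred counts (an assumption, see the note above).
metrics = classification_metrics(tp=50, fp=1, tn=29, fn=0)
```

The function does no zero-division handling, so it assumes every denominator is non-zero, which holds whenever both classes are present and at least one positive prediction is made.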
Results and Discussion
Comparison of results with recent state-of-the-art methods

Methodology                                                              # Features   Accuracy (%)   Sensitivity (%)   Specificity (%)   F1-score (%)
FFO (Jerlin Rubini and Perumal, 2020)                                    11           98.50          97.60             100.00            -
Improved SVM-Radial with Chi-Square (Harimoorthy and Thangavelu, 2020)   11           98.30          100.00            97.60             -
Random Forest with Chi-Square Feature Selection (Yashfi et al., 2020)    20           97.12          97.00             -                 97.00
Deep stacked auto encoder (Khamparia et al., 2019)                       10           100.00         100.00            100.00            100.00
D-ACO (Elhoseny et al., 2019)                                            14           95.00          96.00             93.33             96.00
Naïve Bayes with Best First Algorithm (Arulanthu and Perumal, 2019)      5            80.65          71.00             85.93             -
Proposed MV-GWO                                                          5            98.75          100.00            96.67             99.00
Results and Discussion
[Figure: Comparison of time complexity with and without feature selection]
Model Interpretation
[Figure: SHAP Feature Importance Plot]
[Figure: SHAP Summary Plot]
[Figure: SHAP Dependence Plot]
[Figures: Partial Dependence Plots (three slides)]
Conclusion and Future Work
The proposed MV-GWO method selected 5 critical features, i.e., packed cell volume, diabetes mellitus, red blood cells, blood urea, and pus cells, which resulted in higher performance with the Random Forest model.
The proposed feature selection method with Random Forest showed promising results, comparable to state-of-the-art feature selection methods in the literature.
The results showed that the time complexity of the proposed method with Random Forest was significantly lower than that of the other algorithms using all features.
In the study, we also used SHAP and PDP plots to analyse the effect of the top 5 critical features and to explain how these features contribute to the prediction of CKD.
In future, we will extend our method to the detection of other chronic and critical diseases such as cardiovascular disease, lung disease, liver disease, breast cancer, and cervical cancer.

Machine Learning Predicts Early CKD Using Advanced Feature Selection

  • 1. Prediction of Early-Stage Chronic Kidney Disease Using Machine Learning with Advanced Feature Selection
  • 2. Motivation / Background • Chronic Kidney Disease (CKD) is one of the most deadly non- communicable diseases globally. • In 2017, the total number of worldwide deaths due to CKD was 1.2 million which rose to 41.5% from 1990. • The global burden of CKD is increasing, and is projected to become the 5th most common cause of years of life lost globally by 2040. • In CKD, kidney unable to perform essential functions in the body which leads to other critical diseases such as heart disease, high blood pressure, diabetes etc. • Therefore, early-stage detection of CKD is essential for containing the further progression of the disease, which in turn reduce the mortality rate and treatment cost significantly.
  • 3. Literature Review • In one study, researchers proposed density-based feature selection with Ant Colony Optimization (D-ACO) for detection of CKD, which achieved an accuracy of 95% and a sensitivity of 96%. • In another study, researchers proposed a bio-inspired Fruit Fly Optimization (FFO) based feature selection technique with a Multi-Kernel Support Vector Machine (MKSVM) classifier, which achieved an accuracy of 98.5% and a sensitivity of 97.6%. • In another study, researchers employed a Neuro-Fuzzy algorithm for detection of CKD, which achieved an accuracy of 97%. • In a further study, researchers used a deep learning based stacked auto-encoder for feature selection, which achieved 100% accuracy and 100% sensitivity.
  • 4. Problem Statement • Previous studies have employed only a limited range of feature selection techniques, such as filter and wrapper methods. • The existing literature has used only mean and mode based imputation for handling missing values. • No previous study has employed a model interpretation technique to explain its black-box model. • None of the techniques except D-ACO has employed meta-heuristic based advanced feature selection to select the optimal feature set for early-stage CKD prediction.
  • 6. Methodology Block Diagram of Proposed Feature Selection Method
  • 7. Methodology • Grey Wolf Optimization is a meta-heuristic algorithm introduced by S. Mirjalili, S.M. Mirjalili, and A. Lewis in 2014. • It is based on the leadership hierarchy and hunting pattern of grey wolves in nature. • In terms of leadership hierarchy, α is the leader and decision maker. • β and δ assist α in decision making. • The remaining wolves are ω, which serve as scapegoats. • The main steps of grey wolf hunting are: 1. Searching for prey 2. Tracking, chasing, and approaching the prey 3. Pursuing, encircling, and harassing the prey until it stops moving 4. Attacking the prey
  • 8. Methodology Flow Chart of Grey Wolf Optimization Algorithm: Start → Initialize parameters (number of grey wolves, number of iterations, etc.) → Create initial population of grey wolves with different social hierarchy (α, β, δ and ω) → Estimate the position of prey by α, β and δ → Evaluate the position of grey wolves by the position of the prey → Grade the grey wolves → Stopping criteria satisfied? If yes, End; if no, iterate again.
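The Grey Wolf Optimization loop described in slides 7 and 8 can be sketched in a few lines. This is a minimal illustration of the standard continuous GWO update from Mirjalili et al. (2014), not the MV-GWO feature selection variant proposed in the deck, whose details the slides do not give. The function name `grey_wolf_optimize`, the `sphere` test objective, and all parameter values are assumptions chosen for the example.

```python
import random


def grey_wolf_optimize(objective, dim, bounds, n_wolves=20, n_iters=100, seed=0):
    """Minimize `objective` over a box using a basic Grey Wolf Optimizer."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Create the initial population of wolves at random positions.
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best_pos, best_score = None, float("inf")

    for t in range(n_iters):
        # Grade the wolves: the three best are alpha, beta and delta.
        ranked = sorted(wolves, key=objective)
        top_score = objective(ranked[0])
        if top_score < best_score:
            best_pos, best_score = list(ranked[0]), top_score
        leaders = [list(w) for w in ranked[:3]]

        # `a` decreases linearly from 2 to 0 (exploration -> exploitation).
        a = 2.0 * (1.0 - t / n_iters)

        new_wolves = []
        for wolf in wolves:
            new_pos = []
            for d in range(dim):
                estimates = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - wolf[d])  # distance to the leader
                    estimates.append(leader[d] - A * D)
                # The prey position is estimated as the mean of the three guesses.
                x = sum(estimates) / 3.0
                new_pos.append(min(hi, max(lo, x)))  # keep inside the bounds
            new_wolves.append(new_pos)
        wolves = new_wolves

    # Also consider the final population when reporting the best solution.
    final = min(wolves, key=objective)
    if objective(final) < best_score:
        best_pos, best_score = list(final), objective(final)
    return best_pos, best_score


def sphere(x):
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best, score = grey_wolf_optimize(sphere, dim=3, bounds=(-10.0, 10.0))
print(best, score)
```

For feature selection, a common adaptation is to threshold each coordinate into a keep/drop bit and use a classifier's validation error as the objective; whether MV-GWO works this way is not stated in the slides.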
  • 9. Methodology Model Interpretation: SHapley Additive exPlanations (SHAP) • SHAP was introduced by Lundberg and Lee (2017). • The main objective of SHAP is to explain a model's prediction by showing the contribution of each feature to that prediction. • It is based on cooperative game theory, where the success of a team is attributed to the contribution of each player in the game. • It calculates what the prediction of the model would be with and without a feature.
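The "with and without a feature" idea behind SHAP can be illustrated by computing exact Shapley values for a tiny model. This is a sketch under simplifying assumptions: a "missing" feature is replaced by a fixed baseline value, and every feature subset is enumerated, which is only feasible for a handful of features. The SHAP library itself uses background data and faster approximations. The names `shapley_values`, `toy_model` and `baseline` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at instance `x` relative to `baseline`."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the instance's values; the rest are
        # "absent" and take the baseline values instead.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution: prediction with feature i minus without.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical model: one additive term and one interaction term."""
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that gives SHAP its name.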
  • 10. Methodology Model Interpretation: Partial Dependence Plot (PDP) • PDPs were proposed by Friedman (2001). • The partial dependence plot is a model-agnostic tool that plots how the average predicted value changes as a feature varies, averaging over the marginal distribution of the other features in the dataset. • A PDP gives the overall picture of a feature's contribution to the prediction, i.e. how the predicted value changes with the value of that feature.
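The partial dependence computation described above amounts to fixing one feature at each grid value for every row and averaging the predictions. The sketch below assumes a model exposed as a plain function over a list of feature values; `partial_dependence`, `toy_model`, `rows` and `grid` are hypothetical names and data for illustration.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction over `rows` with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)              # copy so the data is not mutated
            modified[feature_index] = value   # override the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row):
    """Hypothetical model: linear in both features."""
    return 3.0 * row[0] + row[1]


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
grid = [0.0, 1.0, 2.0]
pd_curve = partial_dependence(toy_model, rows, 0, grid)
print(pd_curve)
```

Plotting `grid` against `pd_curve` gives the PDP for feature 0. For this linear toy model the curve is a straight line with slope 3.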
  • 11. Results and Discussion: Comparison of averaged 10-fold cross-validation accuracy with and without feature selection
      | Model | Accuracy (%) without feature selection | Accuracy (%) with feature selection |
      | --- | --- | --- |
      | Random Forest | 97.18 | 98.43 |
      | Adaboost | 99.37 | 99.37 |
      | XGBoost | 96.87 | 96.87 |
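  • 12. Results and Discussion: Comparison of results on test set with and without feature selection
      | Criterion | ML Algorithm | Accuracy (%) | Precision (%) | Sensitivity (%) | Specificity (%) | F1-score (%) | MCC |
      | --- | --- | --- | --- | --- | --- | --- | --- |
      | Without Feature Selection | Random Forest | 95.00 | 97.91 | 94.00 | 96.67 | 95.91 | 0.8959 |
      | Without Feature Selection | Adaboost | 97.50 | 100.00 | 96.00 | 100.00 | 97.95 | 0.9486 |
      | Without Feature Selection | XGBoost | 96.25 | 100.00 | 94.00 | 100.00 | 96.90 | 0.9244 |
      | With Feature Selection | Random Forest | 98.75 | 98.03 | 100.00 | 96.67 | 99.00 | 0.9735 |
      | With Feature Selection | Adaboost | 96.25 | 97.95 | 96.00 | 96.67 | 96.96 | 0.9208 |
      | With Feature Selection | XGBoost | 96.25 | 94.33 | 100.00 | 90.00 | 97.08 | 0.9214 |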
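The metrics reported in slide 12 all derive from the four cells of a binary confusion matrix. The sketch below shows the standard formulas. The slides do not give the confusion matrices, so the counts used here (50 true positives, 1 false positive, 29 true negatives, 0 false negatives) are an assumption, chosen because they agree within rounding with the reported Random Forest row with feature selection.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    No guard against zero denominators: this sketch assumes every class
    and every predicted label occurs at least once.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient, a fraction in [-1, 1] (not a percentage).
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Assumed counts (not from the slides), consistent with an 80-sample test set.
metrics = classification_metrics(tp=50, fp=1, tn=29, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that MCC is reported in the table as a fraction between -1 and 1, unlike the other columns, which are percentages.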
  • 13. Results and Discussion: Comparison of results with recent state-of-the-art methods
      | Methodology | # Features | Accuracy | Sensitivity | Specificity | F1-score |
      | --- | --- | --- | --- | --- | --- |
      | FFO (Jerlin Rubini and Perumal, 2020) | 11 | 98.50 | 97.60 | 100.00 | - |
      | Improved SVM-Radial with Chi-Square (Harimoorthy and Thangavelu, 2020) | 11 | 98.30 | 100.00 | 97.60 | - |
      | Random Forest with Chi-Square Feature Selection (Yashfi et al., 2020) | 20 | 97.12 | 97.00 | - | 97.00 |
      | Deep stacked auto encoder (Khamparia et al., 2019) | 10 | 100.00 | 100.00 | 100.00 | 100.00 |
      | D-ACO (Elhoseny et al., 2019) | 14 | 95.00 | 96.00 | 93.33 | 96.00 |
      | Naïve Bayes with Best First Algorithm (Arulanthu and Perumal, 2019) | 5 | 80.65 | 71.00 | 85.93 | - |
      | Proposed MV-GWO | 5 | 98.75 | 100.00 | 96.67 | 99.00 |
  • 15. Results and Discussion Comparison of time complexity with and without feature selection
  • 22. Conclusion and Future Work • The proposed MV-GWO method selected 5 critical features, i.e. packed cell volume, diabetes mellitus, red blood cells, blood urea and pus cells, which resulted in higher performance with the Random Forest model. • The proposed feature selection method with Random Forest showed promising results, comparable to state-of-the-art feature selection methods in the literature. • The results showed that the time complexity of the proposed method with Random Forest was significantly lower than that of the other algorithms using all features. • We also used SHAP and PDP plots to analyse the effect of the top 5 critical features and to explain how these features contribute to the prediction of CKD. • In future, we will extend our method to the detection of other chronic and critical diseases such as cardiovascular disease, lung disease, liver disease, breast cancer and cervical cancer.