SlideShare a Scribd company logo
1 of 5
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1878
A Machine Learning Model for Diabetes
Josita Sengupta1, Koushik Pal2, Mallika Roy3, Jishnu Nath Paul4
Department of Electronics& Communication Engineering, Guru Nanak Institute of Technology, Kolkata, India
----------------------------------------------------------------------------***----------------------------------------------------------------------------
Abstract - The prediction of diseases benefits greatly from
machine learning methods. Based on the numerous symptoms
that users enter as input to the system, the Ailment Prediction
system uses predictive modeling to anticipate the user's
disease. The algorithm analyses the user's vividsymptomsand
returns the likelihood that the disease will occur as an output.
Different Supervised Machine Learning methods are used for
disease prediction. Accurate medical data analysis enhances
patient care and early illness identification as big data usage
grows in the biomedical and healthcare industries. This
project's overarching goal is to help academics and
professionals choose an appropriate machine learning
algorithm for the healthcare industry. This study aims to give
essential knowledge about the supervised machine learning
algorithms utilized in the healthcare industry. As a result of
compiling a data table on the efficacy of learning algorithms
for various diseases, wewere abletocomparewhichalgorithm
worked best for which specific type of condition. These
publications will aid researchers and practitioners in
understanding the role supervised machine learning
algorithms have in the healthcareindustryandtheaccuracyof
various supervised machine learning algorithms.
Keywords: Machine Learning, Diabetes, Supervised
learning, Unsupervised Learning, Semi-supervisedLearning,
Reinforcement Learning
1. INTRODUCTION
Computers may improve their performance utilizing
minimal data thanks to the science of machine learning. This
field concentrates on learning and understanding through
the study of computer systems. Machine learning works on
two methods: training and testing. Machine learning
technology has improved extraordinarilyina fewdecades to
predict the disease of patients by looking at their symptoms.
Machine learning technology serves as a hugeblessingtothe
healthcare sector in identifying healthissueseasily.Machine
learning uses the various symptoms of the patients as an
input and gives an output based on that. Machine learning
helps doctors to detect a disease at an early stage. For
example, Heart-related problems, Cancer, etc. Healthcare
cases depend a lot on machine learning for diagnosis and
analysis. As a result, it can be said that healthcare is one of
the most important uses of machine learning in the medical
field.
2. EXISTING SYSTEMS
Despite the existence of advanced computer systems,
medicos still need proficiency in X-rays and surgical
operations, however, technology optically discerns it back
after understanding. This process still depends on their
knowledge and experienceinthemedical fieldtounderstand
the factors starting from past medical history, atmosphere,
sugar, blood pressure, weather, and some different aspects.
A significant number of variables are offered as it takes a lot
to comprehend the full process, but no model is
prosperouslyanalyzed.Wecanmanufacturemodelsby using
machine learning medical records to give outputs efficiently
by analyzing the data. Machine learninghelpsdoctorsto give
many precise decisions for treatment choices, which results
in the improvement of patient’s health.
3. PROPOSED SYSTEM
The proposed system allows us to detect the disease by
analyzing the symptoms. This system utilizes many
supervised learning methods for the assessment of the
model. Diseases are identified by this system based on the
said symptoms. This system is operated with the help of
machine learning technology. Diseases like breast cancer,
diabetes, kidney diseases, liver diseases, and heart diseases
can be predicted using this system.
4. DIABETES DISEASES
Diabetes is a metabolic disease that causes a rise in blood
sugar. People having diabetes suffer from dangerous
diseases related to the heart, kidneys, and sight; that’s why
it’s a lot more dangerous for them. Lack of energy, feeling
thirsty, blurry eyesight, lightweight, urinating more often,
etc. are some of the symptoms of diabetes. The health
centres accumulate the obligatory data for the diagnosis of
the disease through tests and the treatments that are
required based on this. The healthcare field depends on the
information they gather by analyzing the disease and later
operating based on that. This sector is predicated on an
immense amplitude of data. Utilizing sizably voluminous
data analytics that works with immense data sets and
recovers rear data and the connections to process those
received data to give an accurate output. Some of the effects
and symptoms of diabetes are given below:
• Women suffering from any type of diabetes have
high blood sugar during their pregnancy which risks the
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1879
birth of the baby. The health of the mother can also
deteriorate a lot because of this.
• Diabetes causes a huge rise in blood glucose.Ahuge
amount of glucose in the blood cangiverisetovarioushealth
problems after some time.
• Diabetic patients suffer much more fromhighblood
pressure than those without it. Without any treatments, this
high blood pressure can cause heart disease and strokes.
• Skin sickness is one of the evident symptoms of
diabetes. A high rise in blood glucose leads to inadequate
blood flow which causes severe skin damage. Destruction of
skin cells leads to high sensitivity towards temperature and
pressure.
• Type 1 diabetes happens due to the lack of
production of insulin by the cells of the pancreas. As a result,
glucose can’t provide energy in the body. Thus, type 1
diabetes can take lives if insulin is not injected for its
lifetime.
• A diabetic patient needs to have a Body Mass Index
(BMI) of 25 or less and maintain a healthydiettopreventthe
disease.
• Middle-aged persons have a higher prevalence of
type 2 diabetes. However, today's youth are also growing
greatly.
• Diabetes Pedigree Function is used to indicate
whether the disease has been passed down to the children
through their parents or not.
Fig1: Diabetes
5. DIFFERENT TYPES OF LEARNING
5.1 Supervised learning
It is a method where a machine is taught using well-labeled
data. This well-labeled data is targeted to train the
algorithms into classifying data or predicting the accurate
result. The processes under the supervised learning
algorithm are
Classification – It is a type of algorithm that classifies a
dataset.
Regression –It is a particular kind of supervised learning
technique used to forecast continuous outcomes.
5.2 Unsupervised Learning
It is a method in which the machine works without
supervision. Since the data, in this case, is unlabeled, the
machine must explore the dataset and look for hidden
patterns to anticipate the output without human
involvement. The types under unsupervised learning are:
Clustering: It is a type of data mining technique for grouping
unlabelled data predicated on their homogeneousattributes
or differences.
Association: It is another type of unsupervised learning that
uses different rules and discovers patterns in data.
5.3 Semi-supervised Learning
Semi-supervised learning refers to learning that combines
supervised and unsupervised methods. In semi-supervised
learning, the data is fused with a littleamountoflabeleddata
with a myriad amount of unlabelled data.
5.4 Reinforcement Learning
Reinforcement learning is a type of machine learning where
an agent learns to interact with its circumventing
environment by invoking actions and discovering errors or
rewards. It’s all about taking appropriate action and
maximizing as many rewards as possible in a particular
situation.
Fig 2: Types of Machine Learning
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1880
6. MACHINE LEARNING MODEL DEVELOPMENT
TECHNIQUES
The procedures required in creating a machine learning
algorithm are as follows:
6.1 Data access and collection
The first step in building a model is to gather data from
numerous sources and sort out features that have the most
impact on the study. This is a crucial stepbecausethequality
and quantity of the information we collect will directly
determine how effective the predictive model can be.
6.2 Data preparation
It's time to assess the data now that it has been collected.
This aids in avoiding significant problems and ensuring
better outcomes. We may also pre-process at this level by
doing the following actions:
Data formatting: Data must be recordedina standardformat
that our machine learning can comprehend, such as XML,
CSV, etc., to use the gathered data in machine learning.
Data Cleaning: Data cleaning is mostly utilized in machine
learning algorithms to remove noise, inaccurate data, and
duplicate data from our dataset.
Data Reduction: Minimizing the sample data for obtaining a
better predictive model in machine learning.
6.3 Data Transformation
After data cleaning, we convert the clean data into another
format that will allow us to recover information. This is a
crucial step since it improves the prediction model's
precision.
6.4 Model Training
In model training, the collection of clean andprocesseddata
is divided into training data and testing data. The training
dataset is used to train the model, whilethetestingdatasetis
used to assess the model's performance on the data. Due to
the usage of labeled data here, this characteristic is not
present in unsupervised learning.
6.5 Evaluation
The model that was created after the algorithm was tested
might now be utilized to make predictions in real-time.
7. TYPES OF SUPERVISED MACHINE LEARNING
ALGORITHMS
7.1 LOGISTIC REGRESSION
The statistical method utilized in machinelearningislogistic
regression. It is more frequently employed to address
categorization issues.Logisticregressionisusedtoforecasta
dependent variable's categorical output. The result must
thus be a discrete or categorical value. It can be either True
or False, Yes or No, 0 or 1, etc., but rather than providing the
precise values of 0 and 1, it provides the probability values
that fall between 0 and 1.
7.2 RANDOM FOREST
Random Forest or random decision forest is constructed by
utilizing multiple decision trees and the final decision is
obtained by majority votes of the decision trees. It may be
utilized to address issues with regression or classification. .
The key benefits provided by random forest when used in
regression or classification problems are Minimizing
overfitting: Decision trees that are extremely massive run
the risk of overfitting within training data, thus causing the
classification output to drasticallydifferfora small changein
the input value. They are hypersensitive to their training
data, which may result in more errors in the test dataset
(value).
7.3 NAIVE BAYES
Based on the Bayes theorem, the probabilistic supervised
machine learning technique knownasNaiveBayesisutilized
to resolve classification issues. The Bayes Theorem
calculates the probability that an event will occur given the
probability that an earlier event will occur. The following
equation provides the mathematical version of Bayes'
theorem: P(A|B) =P(B|A) P(A) /P (B)
According to the equation above, we may say that:
P(B|A) is the posterior probabilityofthegivenclasswhereB
is the target and A is the attributes. P (B) denotes the class
prior probability.
P(A|B) is the chance of predictor given class.
7.4 SUPPORT VECTOR MACHINE
Both classification and regression are performed using a
supervised machine learning technique known as the
Support Vector Machine (SVM). In the beginning, each data
point is mapped into an n-dimensional attribute space (n
being the total number of attributes). The process then
determines the hyperplane that splits the data points into
two classes (divisions), maximizing the margin distance
between the classes(divisions)andminimizingclassification
mistakes. The value of each attribute is then calculated to be
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1881
equal to the value of that specific coordinate, transforming
each data point into a point in an n-dimensional space. The
next step is to look for the hyperplane that separatesthetwo
classes by the greatest margin.
7.5 K-NEAREST NEIGHBOUR
K-Nearest Neighbour is one of the most straightforward
machine learning algorithms that can do both classification
and regression tasks. Its foundation is the supervised
learning approach. . K-NN makes no assumptions about the
underlying data because it is a non-parametric approach. As
a result of saving the training dataset rather than instantly
learning from it, the method is sometimes referred to as a
lazy learner. Instead, it acts while categorizing data by using
the dataset. KNN categorizes fresh data into a category that
is relatively close to the training data by merely saving the
information during the training phase.
7.6 XG BOOST
XG Boost which stands for Extreme Gradient Boosting is a
distributed, scalable gradient-boosted decision tree (GBDT)
machine learning framework. Artificial neural networks
frequently outperform all otheralgorithmsorframeworksin
prediction issues involving non-structured data. However,
decision tree-based algorithms are currently thought to be
best-in-class for small- to medium-sized structured/tabular
data
8. Evaluating the model and results
Summarizing the results obtained from our model in the
following table:
Disease /
Algorithms
Diabetes
Random Forest Classifiers 77
Logistic Regression 80
KNN 74
Naive Bayes 75.97
Support Vector Machine 76
XG Boost 79.87
9. LITERATURE SURVEY
In Paper 1: Disease Prediction using Machine Learning: In
the year 2019, this paper was published by Kedar Pingale,
Sushant Surwase, Vaibhav Kulkarni, Saurabh Storage, Prof.
Abhijeet Karve. In their paper, they used the data provided
by the patients as input and gives the probability of disease
as output. They have used the Naive Bayes Classifier to
predict the disease. With the improvement of technology in
the medical field, doctors can now detect a disease at an
early stage.
In Paper 2: Diabetes diagnosis usingmachinelearning:Inthe
year 2021, this paper was published by Boshra Farajollahi,
Maysam Mehmannavaz, Hafez Mehrjoo, Fateme Moghbeli,
Mohammad Javad Sayadi. Intheir paper,theyhavediscussed
the chronic disease diabetes which is dangerous for our
health. They wanted to evaluate the performancesoflogistic
regression (LR), decision tree (DT), random forest (RF), and
other models for diabetes classification. Through this, they
wanted to differentiate between various methods of
detecting disease.
Paper 3: Multiple Disease Prognostication Based on
Symptoms Using Machine Learning Techniques: This paper
was published by Kajal Patil, Sakshee Pawar, Pramita
Sandhyan, and Jyoti Kundale in the year 2022. In this paper,
they have discussed the importance of health care for every
human being. They also made a model which can process
output by using a vast source of information. Since the
output will be based on the collected data, and that’s why it
will be different for every person.
In Paper 4: Using Three Machine Learning Techniques for
Predicting Breast Cancer Recurrence: This paper was
published by Ahmad LG, Eshlaghy AT, Poorebrahimi A,
Ebrahimi M, and Razavi AR in the year 2019. In this paper,
they have discussed the models they made to predict the
presence of breast cancer by analyzing the information they
collected. To evaluate the performance of the models, their
accuracy, sensitivity, and otherfeatureswerecompared. The
models are used for the early detection of the disease.
10. CONCLUSION
In this study, we investigated the use of ML-based
techniques in healthcare. To do this, we first provided an
overview of machine learning and its use in healthcare. We
categorized machine learning (ML)-based approaches in
medicine based on learning techniques (unsupervised
learning, supervised learning, semi-supervisedlearning, and
reinforcement learning), data pre-processing approaches
(data cleaning approaches, data reduction approaches, and
data formatting approaches), and assessment approaches.
Because medical data is developing at an increasing rate,
current data must be processed to forecast precise illnesses
based on symptoms. To categorize patient data, several
generic illness prediction systems based on machine
learning algorithms, including Logistic Regression, Random
Forest, Naive Bayes, Support Vector Machine, K-Nearest
Neighbour, and XG Boost have been presented.
11. REFERENCES
1) Pingale, K., Surwase, S., Kulkarni, V., Sarage, S., & Karve, A.
(2019). Disease Prediction using Machine Learning.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1882
2) Balasubramanian, Satyabhama, and Balaji Subramanian.
"Symptom-based disease prediction in the medical system
by using Kmeans algorithm." International Journal of
Advances in Computer Science and Technology 3
3) Abu-Jamous, B., Fa, R., Nandi, A.K., 2015a. Feature
Selection.IntegrativeCluster AnalysisinBioinformatics.John
Wiley & Sons, Ltd.
4) A. K. Jain and R. C. Dubes. Algorithms for clustering data.
Prentice-Hall, Inc., 1988
5) Tapak L, Mahjub H, Hamidi O, Poorolajal J. Real-data
comparison of data miningmethodsinpredictionofdiabetes
in Iran. Healthc Inform Res. 2013; 19(3):177–85.
6) Ahmad LG, Eshlaghy A, Poorebrahimi A, Ebrahimi M,
Razavi A. Using three machine learning techniques for
predicting breast cancer recurrence. J Health Med Inform.
2013;4(124)
7) Toshniwal D, Goel B, Sharma H. Multistage Classification
for Cardiovascular Disease Risk Prediction. In: International
Conference on Big Data Analytics; 2015.p.258–66.Springer.
8) Mustaqeem A, Anwar SM, Majid M, Khan AR. Wrapper
method for feature selection to classify cardiac arrhythmia.
In: Engineering in Medicine and Biology Society (EMBC),
39th Annual International Conference of the IEEE; 2017. p.
3656–9. IEEE.

More Related Content

Similar to A Machine Learning Model for Diabetes

Similar to A Machine Learning Model for Diabetes (20)

PREDICTION OF DIABETES (SUGAR) USING MACHINE LEARNING TECHNIQUES
PREDICTION OF DIABETES (SUGAR) USING MACHINE LEARNING TECHNIQUESPREDICTION OF DIABETES (SUGAR) USING MACHINE LEARNING TECHNIQUES
PREDICTION OF DIABETES (SUGAR) USING MACHINE LEARNING TECHNIQUES
 
IRJET- Diabetes Diagnosis using Machine Learning Algorithms
IRJET- Diabetes Diagnosis using Machine Learning AlgorithmsIRJET- Diabetes Diagnosis using Machine Learning Algorithms
IRJET- Diabetes Diagnosis using Machine Learning Algorithms
 
Implementation of a Web Application to Foresee and Pretreat Diabetes Mellitus...
Implementation of a Web Application to Foresee and Pretreat Diabetes Mellitus...Implementation of a Web Application to Foresee and Pretreat Diabetes Mellitus...
Implementation of a Web Application to Foresee and Pretreat Diabetes Mellitus...
 
HealthCare Chatbot
HealthCare ChatbotHealthCare Chatbot
HealthCare Chatbot
 
ELECTRONIC HEALTH RECORD USING THE SPARSE BALANCE SUPPORT VECTOR MACHINE FOR ...
ELECTRONIC HEALTH RECORD USING THE SPARSE BALANCE SUPPORT VECTOR MACHINE FOR ...ELECTRONIC HEALTH RECORD USING THE SPARSE BALANCE SUPPORT VECTOR MACHINE FOR ...
ELECTRONIC HEALTH RECORD USING THE SPARSE BALANCE SUPPORT VECTOR MACHINE FOR ...
 
ML In Predicting Diabetes In The Early Stage
ML In Predicting Diabetes In The Early StageML In Predicting Diabetes In The Early Stage
ML In Predicting Diabetes In The Early Stage
 
A COMPREHENSIVE SURVEY ON CARDIAC ARREST RISK LEVEL PREDICTION SYSTEM
A COMPREHENSIVE SURVEY ON CARDIAC ARREST RISK LEVEL PREDICTION SYSTEMA COMPREHENSIVE SURVEY ON CARDIAC ARREST RISK LEVEL PREDICTION SYSTEM
A COMPREHENSIVE SURVEY ON CARDIAC ARREST RISK LEVEL PREDICTION SYSTEM
 
Multi Disease Detection using Deep Learning
Multi Disease Detection using Deep LearningMulti Disease Detection using Deep Learning
Multi Disease Detection using Deep Learning
 
IRJET - Prediction and Analysis of Multiple Diseases using Machine Learni...
IRJET -  	  Prediction and Analysis of Multiple Diseases using Machine Learni...IRJET -  	  Prediction and Analysis of Multiple Diseases using Machine Learni...
IRJET - Prediction and Analysis of Multiple Diseases using Machine Learni...
 
IRJET- Predicting Diabetes Disease using Effective Classification Techniques
IRJET-  	  Predicting Diabetes Disease using Effective Classification TechniquesIRJET-  	  Predicting Diabetes Disease using Effective Classification Techniques
IRJET- Predicting Diabetes Disease using Effective Classification Techniques
 
Cloud based Health Prediction System
Cloud based Health Prediction SystemCloud based Health Prediction System
Cloud based Health Prediction System
 
IRJET - Deep Multiple Instance Learning for Automatic Detection of Diabetic R...
IRJET - Deep Multiple Instance Learning for Automatic Detection of Diabetic R...IRJET - Deep Multiple Instance Learning for Automatic Detection of Diabetic R...
IRJET - Deep Multiple Instance Learning for Automatic Detection of Diabetic R...
 
IRJET- Chronic Kidney Disease Prediction based on Naive Bayes Technique
IRJET- Chronic Kidney Disease Prediction based on Naive Bayes TechniqueIRJET- Chronic Kidney Disease Prediction based on Naive Bayes Technique
IRJET- Chronic Kidney Disease Prediction based on Naive Bayes Technique
 
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
 
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
A CONCEPTUAL APPROACH TO ENHANCE PREDICTION OF DIABETES USING ALTERNATE FEATU...
 
EYE DISEASE IDENTIFICATION USING DEEP LEARNING
EYE DISEASE IDENTIFICATION USING DEEP LEARNINGEYE DISEASE IDENTIFICATION USING DEEP LEARNING
EYE DISEASE IDENTIFICATION USING DEEP LEARNING
 
“Detection of Diseases using Machine Learning”
“Detection of Diseases using Machine Learning”“Detection of Diseases using Machine Learning”
“Detection of Diseases using Machine Learning”
 
IRJET- Diabetes Mellitus Diagnosis using Artificial Intelligence Techniques C...
IRJET- Diabetes Mellitus Diagnosis using Artificial Intelligence Techniques C...IRJET- Diabetes Mellitus Diagnosis using Artificial Intelligence Techniques C...
IRJET- Diabetes Mellitus Diagnosis using Artificial Intelligence Techniques C...
 
Smart Healthcare Prediction System Using Machine Learning
Smart Healthcare Prediction System Using Machine LearningSmart Healthcare Prediction System Using Machine Learning
Smart Healthcare Prediction System Using Machine Learning
 
Diagnosis of Diabetes Mellitus Using Machine Learning Techniques
Diagnosis of Diabetes Mellitus Using Machine Learning TechniquesDiagnosis of Diabetes Mellitus Using Machine Learning Techniques
Diagnosis of Diabetes Mellitus Using Machine Learning Techniques
 

More from IRJET Journal

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ssuser89054b
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
Epec Engineered Technologies
 

Recently uploaded (20)

Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
Jaipur ❤CALL GIRL 0000000000❤CALL GIRLS IN Jaipur ESCORT SERVICE❤CALL GIRL IN...
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
 
💚Trustworthy Call Girls Pune Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top...
💚Trustworthy Call Girls Pune Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top...💚Trustworthy Call Girls Pune Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top...
💚Trustworthy Call Girls Pune Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top...
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxOrlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 

A Machine Learning Model for Diabetes

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1878 A Machine Learning Model for Diabetes Josita Sengupta1, Koushik Pal2, Mallika Roy3, Jishnu Nath Paul4 Department of Electronics& Communication Engineering, Guru Nanak Institute of Technology, Kolkata, India ----------------------------------------------------------------------------***---------------------------------------------------------------------------- Abstract - The prediction of diseases benefits greatly from machine learning methods. Based on the numerous symptoms that users enter as input to the system, the Ailment Prediction system uses predictive modeling to anticipate the user's disease. The algorithm analyses the user's vividsymptomsand returns the likelihood that the disease will occur as an output. Different Supervised Machine Learning methods are used for disease prediction. Accurate medical data analysis enhances patient care and early illness identification as big data usage grows in the biomedical and healthcare industries. This project's overarching goal is to help academics and professionals choose an appropriate machine learning algorithm for the healthcare industry. This study aims to give essential knowledge about the supervised machine learning algorithms utilized in the healthcare industry. As a result of compiling a data table on the efficacy of learning algorithms for various diseases, wewere abletocomparewhichalgorithm worked best for which specific type of condition. These publications will aid researchers and practitioners in understanding the role supervised machine learning algorithms have in the healthcareindustryandtheaccuracyof various supervised machine learning algorithms. Keywords: Machine Learning, Diabetes, Supervised learning, Unsupervised Learning, Semi-supervisedLearning, Reinforcement Learning 1. INTRODUCTION Computers may improve their performance utilizing minimal data thanks to the science of machine learning. This field concentrates on learning and understanding through the study of computer systems. Machine learning works on two methods: training and testing. Machine learning technology has improved extraordinarilyina fewdecades to predict the disease of patients by looking at their symptoms. Machine learning technology serves as a hugeblessingtothe healthcare sector in identifying healthissueseasily.Machine learning uses the various symptoms of the patients as an input and gives an output based on that. Machine learning helps doctors to detect a disease at an early stage. For example, Heart-related problems, Cancer, etc. Healthcare cases depend a lot on machine learning for diagnosis and analysis. As a result, it can be said that healthcare is one of the most important uses of machine learning in the medical field. 2. EXISTING SYSTEMS Despite the existence of advanced computer systems, medicos still need proficiency in X-rays and surgical operations, however, technology optically discerns it back after understanding. This process still depends on their knowledge and experienceinthemedical fieldtounderstand the factors starting from past medical history, atmosphere, sugar, blood pressure, weather, and some different aspects. A significant number of variables are offered as it takes a lot to comprehend the full process, but no model is prosperouslyanalyzed.Wecanmanufacturemodelsby using machine learning medical records to give outputs efficiently by analyzing the data. Machine learninghelpsdoctorsto give many precise decisions for treatment choices, which results in the improvement of patient’s health. 3. PROPOSED SYSTEM The proposed system allows us to detect the disease by analyzing the symptoms. This system utilizes many supervised learning methods for the assessment of the model. Diseases are identified by this system based on the said symptoms. This system is operated with the help of machine learning technology. Diseases like breast cancer, diabetes, kidney diseases, liver diseases, and heart diseases can be predicted using this system. 4. DIABETES DISEASES Diabetes is a metabolic disease that causes a rise in blood sugar. People having diabetes suffer from dangerous diseases related to the heart, kidneys, and sight; that’s why it’s a lot more dangerous for them. Lack of energy, feeling thirsty, blurry eyesight, lightweight, urinating more often, etc. are some of the symptoms of diabetes. The health centres accumulate the obligatory data for the diagnosis of the disease through tests and the treatments that are required based on this. The healthcare field depends on the information they gather by analyzing the disease and later operating based on that. This sector is predicated on an immense amplitude of data. Utilizing sizably voluminous data analytics that works with immense data sets and recovers rear data and the connections to process those received data to give an accurate output. Some of the effects and symptoms of diabetes are given below: • Women suffering from any type of diabetes have high blood sugar during their pregnancy which risks the
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1879 birth of the baby. The health of the mother can also deteriorate a lot because of this. • Diabetes causes a huge rise in blood glucose.Ahuge amount of glucose in the blood cangiverisetovarioushealth problems after some time. • Diabetic patients suffer much more fromhighblood pressure than those without it. Without any treatments, this high blood pressure can cause heart disease and strokes. • Skin sickness is one of the evident symptoms of diabetes. A high rise in blood glucose leads to inadequate blood flow which causes severe skin damage. Destruction of skin cells leads to high sensitivity towards temperature and pressure. • Type 1 diabetes happens due to the lack of production of insulin by the cells of the pancreas. As a result, glucose can’t provide energy in the body. Thus, type 1 diabetes can take lives if insulin is not injected for its lifetime. • A diabetic patient needs to have a Body Mass Index (BMI) of 25 or less and maintain a healthydiettopreventthe disease. • Middle-aged persons have a higher prevalence of type 2 diabetes. However, today's youth are also growing greatly. • Diabetes Pedigree Function is used to indicate whether the disease has been passed down to the children through their parents or not. Fig1: Diabetes 5. DIFFERENT TYPES OF LEARNING 5.1 Supervised learning It is a method where a machine is taught using well-labeled data. This well-labeled data is targeted to train the algorithms into classifying data or predicting the accurate result. The processes under the supervised learning algorithm are Classification – It is a type of algorithm that classifies a dataset. Regression –It is a particular kind of supervised learning technique used to forecast continuous outcomes. 5.2 Unsupervised Learning It is a method in which the machine works without supervision. Since the data, in this case, is unlabeled, the machine must explore the dataset and look for hidden patterns to anticipate the output without human involvement. The types under unsupervised learning are: Clustering: It is a type of data mining technique for grouping unlabelled data predicated on their homogeneousattributes or differences. Association: It is another type of unsupervised learning that uses different rules and discovers patterns in data. 5.3 Semi-supervised Learning Semi-supervised learning refers to learning that combines supervised and unsupervised methods. In semi-supervised learning, the data is fused with a littleamountoflabeleddata with a myriad amount of unlabelled data. 5.4 Reinforcement Learning Reinforcement learning is a type of machine learning where an agent learns to interact with its circumventing environment by invoking actions and discovering errors or rewards. It’s all about taking appropriate action and maximizing as many rewards as possible in a particular situation. Fig 2: Types of Machine Learning
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1880 6. MACHINE LEARNING MODEL DEVELOPMENT TECHNIQUES The procedures required in creating a machine learning algorithm are as follows: 6.1 Data access and collection The first step in building a model is to gather data from numerous sources and sort out features that have the most impact on the study. This is a crucial stepbecausethequality and quantity of the information we collect will directly determine how effective the predictive model can be. 6.2 Data preparation It's time to assess the data now that it has been collected. This aids in avoiding significant problems and ensuring better outcomes. We may also pre-process at this level by doing the following actions: Data formatting: Data must be recordedina standardformat that our machine learning can comprehend, such as XML, CSV, etc., to use the gathered data in machine learning. Data Cleaning: Data cleaning is mostly utilized in machine learning algorithms to remove noise, inaccurate data, and duplicate data from our dataset. Data Reduction: Minimizing the sample data for obtaining a better predictive model in machine learning. 6.3 Data Transformation After data cleaning, we convert the clean data into another format that will allow us to recover information. This is a crucial step since it improves the prediction model's precision. 6.4 Model Training In model training, the collection of clean andprocesseddata is divided into training data and testing data. The training dataset is used to train the model, whilethetestingdatasetis used to assess the model's performance on the data. Due to the usage of labeled data here, this characteristic is not present in unsupervised learning. 6.5 Evaluation The model that was created after the algorithm was tested might now be utilized to make predictions in real-time. 7. TYPES OF SUPERVISED MACHINE LEARNING ALGORITHMS 7.1 LOGISTIC REGRESSION The statistical method utilized in machinelearningislogistic regression. It is more frequently employed to address categorization issues.Logisticregressionisusedtoforecasta dependent variable's categorical output. The result must thus be a discrete or categorical value. It can be either True or False, Yes or No, 0 or 1, etc., but rather than providing the precise values of 0 and 1, it provides the probability values that fall between 0 and 1. 7.2 RANDOM FOREST Random Forest or random decision forest is constructed by utilizing multiple decision trees and the final decision is obtained by majority votes of the decision trees. It may be utilized to address issues with regression or classification. . The key benefits provided by random forest when used in regression or classification problems are Minimizing overfitting: Decision trees that are extremely massive run the risk of overfitting within training data, thus causing the classification output to drasticallydifferfora small changein the input value. They are hypersensitive to their training data, which may result in more errors in the test dataset (value). 7.3 NAIVE BAYES Based on the Bayes theorem, the probabilistic supervised machine learning technique knownasNaiveBayesisutilized to resolve classification issues. The Bayes Theorem calculates the probability that an event will occur given the probability that an earlier event will occur. The following equation provides the mathematical version of Bayes' theorem: P(A|B) =P(B|A) P(A) /P (B) According to the equation above, we may say that: P(B|A) is the posterior probabilityofthegivenclasswhereB is the target and A is the attributes. P (B) denotes the class prior probability. P(A|B) is the chance of predictor given class. 7.4 SUPPORT VECTOR MACHINE Both classification and regression are performed using a supervised machine learning technique known as the Support Vector Machine (SVM). In the beginning, each data point is mapped into an n-dimensional attribute space (n being the total number of attributes). The process then determines the hyperplane that splits the data points into two classes (divisions), maximizing the margin distance between the classes(divisions)andminimizingclassification mistakes. The value of each attribute is then calculated to be
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1881 equal to the value of that specific coordinate, transforming each data point into a point in an n-dimensional space. The next step is to look for the hyperplane that separatesthetwo classes by the greatest margin. 7.5 K-NEAREST NEIGHBOUR K-Nearest Neighbour is one of the most straightforward machine learning algorithms that can do both classification and regression tasks. Its foundation is the supervised learning approach. . K-NN makes no assumptions about the underlying data because it is a non-parametric approach. As a result of saving the training dataset rather than instantly learning from it, the method is sometimes referred to as a lazy learner. Instead, it acts while categorizing data by using the dataset. KNN categorizes fresh data into a category that is relatively close to the training data by merely saving the information during the training phase. 7.6 XG BOOST XG Boost which stands for Extreme Gradient Boosting is a distributed, scalable gradient-boosted decision tree (GBDT) machine learning framework. Artificial neural networks frequently outperform all otheralgorithmsorframeworksin prediction issues involving non-structured data. However, decision tree-based algorithms are currently thought to be best-in-class for small- to medium-sized structured/tabular data 8. Evaluating the model and results Summarizing the results obtained from our model in the following table: Disease / Algorithms Diabetes Random Forest Classifiers 77 Logistic Regression 80 KNN 74 Naive Bayes 75.97 Support Vector Machine 76 XG Boost 79.87 9. LITERATURE SURVEY In Paper 1: Disease Prediction using Machine Learning: In the year 2019, this paper was published by Kedar Pingale, Sushant Surwase, Vaibhav Kulkarni, Saurabh Storage, Prof. Abhijeet Karve. In their paper, they used the data provided by the patients as input and gives the probability of disease as output. They have used the Naive Bayes Classifier to predict the disease. With the improvement of technology in the medical field, doctors can now detect a disease at an early stage. In Paper 2: Diabetes diagnosis usingmachinelearning:Inthe year 2021, this paper was published by Boshra Farajollahi, Maysam Mehmannavaz, Hafez Mehrjoo, Fateme Moghbeli, Mohammad Javad Sayadi. Intheir paper,theyhavediscussed the chronic disease diabetes which is dangerous for our health. They wanted to evaluate the performancesoflogistic regression (LR), decision tree (DT), random forest (RF), and other models for diabetes classification. Through this, they wanted to differentiate between various methods of detecting disease. Paper 3: Multiple Disease Prognostication Based on Symptoms Using Machine Learning Techniques: This paper was published by Kajal Patil, Sakshee Pawar, Pramita Sandhyan, and Jyoti Kundale in the year 2022. In this paper, they have discussed the importance of health care for every human being. They also made a model which can process output by using a vast source of information. Since the output will be based on the collected data, and that’s why it will be different for every person. In Paper 4: Using Three Machine Learning Techniques for Predicting Breast Cancer Recurrence: This paper was published by Ahmad LG, Eshlaghy AT, Poorebrahimi A, Ebrahimi M, and Razavi AR in the year 2019. In this paper, they have discussed the models they made to predict the presence of breast cancer by analyzing the information they collected. To evaluate the performance of the models, their accuracy, sensitivity, and otherfeatureswerecompared. The models are used for the early detection of the disease. 10. CONCLUSION In this study, we investigated the use of ML-based techniques in healthcare. To do this, we first provided an overview of machine learning and its use in healthcare. We categorized machine learning (ML)-based approaches in medicine based on learning techniques (unsupervised learning, supervised learning, semi-supervisedlearning, and reinforcement learning), data pre-processing approaches (data cleaning approaches, data reduction approaches, and data formatting approaches), and assessment approaches. Because medical data is developing at an increasing rate, current data must be processed to forecast precise illnesses based on symptoms. To categorize patient data, several generic illness prediction systems based on machine learning algorithms, including Logistic Regression, Random Forest, Naive Bayes, Support Vector Machine, K-Nearest Neighbour, and XG Boost have been presented. 11. REFERENCES 1) Pingale, K., Surwase, S., Kulkarni, V., Sarage, S., & Karve, A. (2019). Disease Prediction using Machine Learning.
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1882 2) Balasubramanian, Satyabhama, and Balaji Subramanian. "Symptom-based disease prediction in the medical system by using Kmeans algorithm." International Journal of Advances in Computer Science and Technology 3 3) Abu-Jamous, B., Fa, R., Nandi, A.K., 2015a. Feature Selection.IntegrativeCluster AnalysisinBioinformatics.John Wiley & Sons, Ltd. 4) A. K. Jain and R. C. Dubes. Algorithms for clustering data. Prentice-Hall, Inc., 1988 5) Tapak L, Mahjub H, Hamidi O, Poorolajal J. Real-data comparison of data miningmethodsinpredictionofdiabetes in Iran. Healthc Inform Res. 2013; 19(3):177–85. 6) Ahmad LG, Eshlaghy A, Poorebrahimi A, Ebrahimi M, Razavi A. Using three machine learning techniques for predicting breast cancer recurrence. J Health Med Inform. 2013;4(124) 7) Toshniwal D, Goel B, Sharma H. Multistage Classification for Cardiovascular Disease Risk Prediction. In: International Conference on Big Data Analytics; 2015.p.258–66.Springer. 8) Mustaqeem A, Anwar SM, Majid M, Khan AR. Wrapper method for feature selection to classify cardiac arrhythmia. In: Engineering in Medicine and Biology Society (EMBC), 39th Annual International Conference of the IEEE; 2017. p. 3656–9. IEEE.