2. INTRODUCTION
• Cardiac arrhythmia, a prevalent and potentially life-threatening condition, pertains to
irregularities in the rhythm of the heart's electrical impulses.
• The human heart, a marvel of physiological engineering, relies on a precisely coordinated
sequence of electrical signals to contract and pump blood efficiently.
• When this orchestration falters, it can result in arrhythmias, disrupting the normal
heartbeat and potentially leading to severe complications, including strokes and heart
failure.
• With the rising incidence of cardiovascular diseases globally, the timely and accurate
detection of cardiac arrhythmias has become paramount for effective clinical intervention
and patient care.
• In the realm of cardiac arrhythmia detection, this project seeks to leverage advanced
machine learning techniques to enhance the diagnostic capabilities of healthcare
professionals.
• The dataset under examination comprises a comprehensive array of 278 features collected
from 1356 patient samples.
3. INTRODUCTION (CONTD…)
• Through a meticulous process of feature selection using the random forest algorithm, the project
identifies the principal determinants of arrhythmias, paving the way for a streamlined and focused
analysis.
• Subsequently, a suite of classification algorithms, including weighted k-nearest neighbors (kNN),
is employed on the reduced feature set to discern and categorize different types of arrhythmias.
• The emphasis lies not only on achieving high accuracy but also on robust generalization to diverse
patient profiles.
• The significance of this project transcends its immediate technical implications, addressing a
critical healthcare challenge. Accurate arrhythmia detection enables healthcare providers to tailor
interventions, administer appropriate treatments, and potentially prevent life-threatening cardiac
events.
• Moreover, the incorporation of machine learning in this domain showcases the convergence of
medical science and technology, promising a paradigm shift in the way we approach cardiovascular
health.
• As the project unfolds, it is poised to contribute valuable insights to the burgeoning field of digital
health, fostering innovations that could revolutionize the landscape of cardiovascular care.
4. CLASSIFICATION TYPES
Class code | Class | Number of instances
01 | Normal | 245
02 | Ischemic changes (Coronary Artery Disease) | 44
03 | Old Anterior Myocardial Infarction | 15
04 | Old Inferior Myocardial Infarction | 15
05 | Sinus tachycardia | 13
06 | Sinus bradycardia | 25
07 | Ventricular Premature Contraction (PVC) | 3
08 | Supraventricular Premature Contraction | 2
09 | Left bundle branch block | 9
10 | Right bundle branch block | 50
11 | First-degree atrioventricular (AV) block | 0
12 | Second-degree AV block | 0
13 | Third-degree AV block | 0
14 | Left ventricular hypertrophy | 4
15 | Atrial Fibrillation or Flutter | 5
16 | Others | 22
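As a quick check on the class balance, the counts in the table can be tallied with a few lines of Python. This is only an illustrative sketch: the dictionary `class_counts` is a hypothetical name, and its values are copied by hand from the table above.

```python
# Instance counts per class code, copied from the table above.
class_counts = {
    1: 245, 2: 44, 3: 15, 4: 15, 5: 13, 6: 25, 7: 3, 8: 2,
    9: 9, 10: 50, 11: 0, 12: 0, 13: 0, 14: 4, 15: 5, 16: 22,
}

total = sum(class_counts.values())
# Class codes that have at least one instance in the table.
non_empty = [code for code, n in class_counts.items() if n > 0]
normal_share = class_counts[1] / total

print(f"instances listed: {total}")
print(f"classes with at least one instance: {len(non_empty)}")
print(f"share of class 01 (Normal): {normal_share:.1%}")
```

The tally shows how strongly the table is dominated by class 01, which is worth keeping in mind when reading plain accuracy figures.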
5. INPUT ATTRIBUTES IN RAW DATASET
• 1 Age: Age in years, linear
• 2 Sex: Sex (0 = male; 1 = female), nominal
• 3 Height: Height in centimeters, linear
• 4 Weight: Weight in kilograms, linear
• 5 QRS duration: Average of QRS duration in msec., linear
• 6 P-R interval: Average duration between onset of P and Q waves in msec., linear
• 7 Q-T interval: Average duration between onset of Q and offset of T waves in msec., linear
• 8 T interval: Average duration of T wave in msec., linear
• 9 P interval: Average duration of P wave in msec., linear
• Vector angles in degrees on front plane of the following, linear:
• 10 QRS
• 11 T
• 12 P
• 13 QRST
• 14 J
• 15 Heart rate: Number of heart beats per minute, linear
• Of channel DI:
• Average width, in msec., of: linear
• 16 Q wave
• 17 R wave
• 18 S wave
• 19 R' wave, small peak just after R
• 20 S' wave
• 21 Number of intrinsic deflections, linear
• 22 Existence of ragged R wave, nominal
• 23 Existence of diphasic derivation of R wave, nominal
• 24 Existence of ragged P wave, nominal
• 25 Existence of diphasic derivation of P wave, nominal
• 26 Existence of ragged T wave, nominal
• 27 Existence of diphasic derivation of T wave, nominal
• Of channel DII:
• 28 .. 39 (similar to 16 .. 27 of channel DI)
• Of channel DIII:
• 40 .. 51
• Of channel AVR:
• 52 .. 63
• Of channel AVL:
• 64 .. 75
• Of channel AVF:
• 76 .. 87
• Of channel V1:
• 88 .. 99
• Of channel V2:
• 100 .. 111
• Of channel V3:
• 112 .. 123
• Of channel V4:
• 124 .. 135
• Of channel V5:
• 136 .. 147
• Of channel V6:
• 148 .. 159
• Of channel DI:
• Amplitude, in units of 0.1 millivolt, of:
• 160 JJ wave, linear
• 161 Q wave, linear
• 162 R wave, linear
• 163 S wave, linear
• 164 R' wave, linear
• 165 S' wave, linear
• 166 P wave, linear
• 167 T wave, linear
• 168 QRSA: Sum of areas of all segments divided by 10 (Area = width * height / 2), linear
• 169 QRSTA = QRSA + 0.5 * width of T wave * 0.1 * height of T wave (if T is diphasic, the bigger segment is considered), linear
• Of channel DII:
• 170 .. 179
• Of channel DIII:
• 180 .. 189
• Of channel AVR:
• 190 .. 199
• Of channel AVL:
• 200 .. 209
• Of channel AVF:
• 210 .. 219
• Of channel V1:
• 220 .. 229
• Of channel V2:
• 230 .. 239
• Of channel V3:
• 240 .. 249
• Of channel V4:
• 250 .. 259
• Of channel V5:
• 260 .. 269
• Of channel V6:
• 270 .. 279
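The attribute layout above is regular enough to generate column names programmatically. The sketch below is one possible way to do so. All names in it (`CHANNELS`, `GENERAL`, `WIDTH`, `AMPLITUDE`, `attribute_names`) are hypothetical labels chosen for illustration, and attribute 1 is assumed to be the patient's age.

```python
# The 12 ECG channels, in the order used by the attribute list above.
CHANNELS = ["DI", "DII", "DIII", "AVR", "AVL", "AVF",
            "V1", "V2", "V3", "V4", "V5", "V6"]

# Attributes 1..15 (attribute 1 is assumed to be age).
GENERAL = ["age", "sex", "height", "weight", "qrs_duration", "pr_interval",
           "qt_interval", "t_interval", "p_interval", "qrs_angle", "t_angle",
           "p_angle", "qrst_angle", "j_angle", "heart_rate"]

# 12 per-channel width and shape attributes (16..27 for channel DI).
WIDTH = ["q_width", "r_width", "s_width", "r2_width", "s2_width",
         "n_intrinsic_deflections", "ragged_r", "diphasic_r",
         "ragged_p", "diphasic_p", "ragged_t", "diphasic_t"]

# 10 per-channel amplitude attributes (160..169 for channel DI).
AMPLITUDE = ["jj_amp", "q_amp", "r_amp", "s_amp", "r2_amp", "s2_amp",
             "p_amp", "t_amp", "qrsa", "qrsta"]


def attribute_names():
    """Return one name per attribute, in attribute order 1..279."""
    names = list(GENERAL)                                  # 1..15
    for channel in CHANNELS:                               # 16..159
        names += [f"{channel}_{field}" for field in WIDTH]
    for channel in CHANNELS:                               # 160..279
        names += [f"{channel}_{field}" for field in AMPLITUDE]
    return names


names = attribute_names()
print(len(names))
# Attribute k is stored at index k - 1.
print(names[15], names[27], names[159], names[278])
```

Attribute k sits at index k - 1, so attribute 28 (the first attribute of channel DII) is `names[27]`.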
7. HEART RATE IN A HEALTHY PERSON AND IN A PERSON WITH CARDIAC ARRHYTHMIA
• The normal rhythm of the heart is characterized by a consistent and regular heartbeat. In a healthy individual,
the heart typically beats at a rate between 60 and 100 beats per minute (bpm) at rest.
• This regularity is maintained by the synchronized electrical impulses that coordinate the contraction and
relaxation of the heart's chambers, ensuring an efficient and continuous flow of blood throughout the body.
• The reliability of this rhythmic pattern is essential for optimal cardiovascular function, allowing the heart to
fulfill its vital role in sustaining life.
• Individuals with cardiac arrhythmias, by comparison, experience disruptions in the normal heartbeat pattern.
The heart may beat too fast (tachycardia), too slow (bradycardia), or irregularly.
• The specific impact on the heart rate depends on the type and severity of the arrhythmia. For instance, atrial
fibrillation, a common type of arrhythmia, is characterized by rapid and irregular heartbeats, potentially
exceeding 100 bpm.
• In contrast, bradycardias may lead to heart rates below the normal range, compromising the heart's ability to
pump blood effectively.
• The variation in heart rates among individuals with cardiac arrhythmias underscores the diverse nature of
these conditions and highlights the importance of accurate detection and classification for effective medical
management.
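The rate thresholds described on this slide can be written as a small helper. This is a minimal sketch that uses only the 60 to 100 bpm resting range stated above. The function name `classify_resting_heart_rate` is hypothetical. A rate-only rule cannot detect an irregular rhythm, which is why the project relies on the full ECG feature set.

```python
def classify_resting_heart_rate(bpm: float) -> str:
    """Label a resting heart rate using the 60 to 100 bpm range from the slide."""
    if bpm <= 0:
        raise ValueError("heart rate must be positive")
    if bpm < 60:
        return "bradycardia"
    if bpm > 100:
        return "tachycardia"
    return "normal range"


for rate in (48, 72, 130):
    print(rate, classify_resting_heart_rate(rate))
```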
11. S.no | Title | Year | Author | Methodology | Observation
1 | Smart-IoT Business Process Management: A Case Study on Remote Digital Early Cardiac Arrhythmia Detection and Diagnosis (https://ieeexplore.ieee.org/document/10107739) | 2023 | Patricia Gómez-Valiente; Jennifer Pérez Benedí | Case study, Smart-IoT integration | Efficient early detection and diagnosis of cardiac arrhythmias remotely
2 | Designing Very Fast and Accurate Convolutional Neural Networks With Application in ICD and Smart Electrocardiograph Devices (https://ieeexplore.ieee.org/document/10015745) | 2023 | Alireza Keyanfar; Reza Ghaderi; Soheila Nazari | CNN design, ICD, Smart Electrocardiograph | Developed high-speed and accurate CNNs for ICD and smart ECG devices
12. S.no | Title | Year | Author | Methodology | Observation
3 | A Novel Hybrid Model Based on Convolutional Neural Network With Particle Swarm Optimization Algorithm for Classification of Cardiac Arrhythmias (https://ieeexplore.ieee.org/document/10143187) | 2023 | Fredy Santander Baños; Norberto Hernández Romero | Hybrid CNN-PSO model | Improved classification accuracy in cardiac arrhythmias using hybrid model
4 | Heartbeat Dynamics: A Novel Efficient Interpretable Feature for Arrhythmias Classification (https://ieeexplore.ieee.org/document/10217821) | 2023 | Xunde Dong; Wenjie Si | Feature extraction, interpretable dynamics | Discovered efficient heartbeat dynamics as an interpretable feature for arrhythmia classification
13. S.no | Title | Year | Author | Methodology | Observation
5 | Cardiac Adipose Tissue Segmentation via Image-Level Annotations (https://ieeexplore.ieee.org/document/10093956) | 2023 | Ziyi Huang; Yu Gan; Theresa Lye; Yanchen Liu | Image segmentation, annotation-based approach | Successful segmentation of cardiac adipose tissue using image-level annotations
6 | Feature Selection Using Selective Opposition Based Artificial Rabbits Optimization for Arrhythmia Classification on Internet of Medical Things Environment (https://ieeexplore.ieee.org/document/10242063) | 2023 | G. S. Nijaguna; N. Dayananda Lal; Parameshachari Bidare Divakarachari | Feature selection, artificial rabbits optimization | Improved arrhythmia classification in IoMT environment through selective opposition-
14. S.no | Title | Year | Author | Methodology | Observation
7 | Automatic Cardiac Arrhythmia Classification Using Residual Network Combined With Long Short-Term Memory (https://ieeexplore.ieee.org/document/9794445) | 2022 | Yun Kwan Kim; Minji Lee; Hee Seok Song | Residual Network, LSTM | Enhanced automatic arrhythmia classification using a combination of ResNet and LSTM
8 | MLBF-Net: A Multi-Lead-Branch Fusion Network for Multi-Class Arrhythmia Classification Using 12-Lead ECG (https://ieeexplore.ieee.org/document/9373359) | 2021 | Jing Zhang; Deng Liang; Aiping Liu | Multi-Lead-Branch Fusion Network | Efficient multi-class arrhythmia classification with MLBF-Net using 12-Lead ECG data
15. S.no | Title | Year | Author | Methodology | Observation
9 | Evaluation of Level-Crossing ADCs for Event-Driven ECG Classification (https://ieeexplore.ieee.org/document/9655502) | 2021 | Maryam Saeed; Qingyuan Wang; Olev Märtens | Evaluation of Level-Crossing ADCs | Investigated the suitability of Level-Crossing ADCs for event-driven ECG classification
10 | SRECG: ECG Signal Super-Resolution Framework for Portable/Wearable Devices in Cardiac Arrhythmias Classification (https://ieeexplore.ieee.org/document/10018428) | 2023 | Tsai-Min Chen; Yuan-Hong Tsai; Huan-Hsin Tseng | Signal super-resolution, portable/wearable devices | Proposed SRECG for enhanced ECG signal quality on portable/wearable devices for arrhythmia classification
16. S.no | Title | Year | Author | Methodology | Observation
11 | Manifold Approximating Graph Interpolation of Cardiac Local Activation Time (https://ieeexplore.ieee.org/document/9755048) | 2022 | Jennifer Hellar; Romain Cosentino; Mathews M. John | Manifold Approximating Graph Interpolation | Accurate interpolation of cardiac local activation time using manifold approximation
12 | Three-Heartbeat Multilead ECG Recognition Method for Arrhythmia Classification (https://ieeexplore.ieee.org/document/9762323) | 2022 | Liang-Hung Wang; Yan-Ting Yu; Wei Liu; Lu Xu | Multilead ECG Recognition | Efficient arrhythmia classification with a recognition method based on three heartbeats in multilead ECG
17. S.no | Title | Year | Author | Methodology | Observation
13 | Discrimination of Cardiac Abnormalities Based on Multifractal Analysis in Reservoir Computing Framework (https://ieeexplore.ieee.org/document/10317876) | 2023 | Basab Bijoy Purkayastha; Shovan Barma | Multifractal Analysis, Reservoir Computing | Successful discrimination of cardiac abnormalities using multifractal analysis in a reservoir computing framework
14 | A Takagi-Sugeno Fuzzy-Model-Based Tracking Framework to Regulate Heart Rhythm Dynamics (https://ieeexplore.ieee.org/document/10122546) | 2023 | Jairo Moreno-Sáenz; Ying-Jen Chen; Kazuo Tanaka | Takagi-Sugeno Fuzzy-Model, Heart Rhythm Dynamics | Developed a tracking framework based on Takagi-Sugeno fuzzy models to regulate heart rhythm dynamics
18. S.no | Title | Year | Author | Methodology | Observation
15 | Multimodal Neural Network for Recognition of Cardiac Arrhythmias Based on 12-Lead Electrocardiogram Signals (https://ieeexplore.ieee.org/document/10323401) | 2023 | Mariya R. Kiladze; Ulyana A. Lyakhova; Pavel A. Lyakhov | Multimodal Neural Network | Successful recognition of cardiac arrhythmias using a multimodal neural network with 12-Lead ECG signals
19. EXISTING SYSTEM
• The existing system for cardiac arrhythmia detection primarily relies on traditional electrocardiogram (ECG) monitoring methods,
which involve the use of electrodes placed on the patient's skin to record electrical activity.
• While effective, this approach has limitations in terms of continuous monitoring and real-time detection. Additionally, conventional
ECG devices are not integrated with IoT sensors, which restricts the ability to capture additional patient parameters such as age, sex,
height, and weight.
• This leads to a partial view of the patient's physiological state. Moreover, the data preprocessing techniques employed in the existing
system may not fully harness the potential of advanced machine learning models. As a result, there is room for improvement in
terms of accuracy and efficiency in arrhythmia detection.
• This underscores the need for a more comprehensive and real-time solution that integrates IoT sensors for enhanced data collection
and employs advanced machine learning techniques for accurate classification of cardiac arrhythmias.
20. DRAWBACKS
• The current system lacks the capability to seamlessly accommodate an increasing volume of patient data,
potentially leading to performance bottlenecks as the dataset grows.
• The reliance on a stable internet connection for real-time data transmission from IoT sensors introduces a
vulnerability, as disruptions in connectivity may hinder timely arrhythmia detection.
21. PROPOSED SYSTEM
• The proposed system for this project encompasses a comprehensive framework for real-time cardiac arrhythmia detection utilizing IoT sensors. These sensors will be strategically
deployed to capture essential physiological parameters, including age, sex, height, weight, QRS duration, QT interval, and T wave morphology.
• The collected data will undergo a rigorous preprocessing pipeline to ensure its quality and suitability for machine learning analysis. Subsequently, a state-of-the-art multi-class
classification model will be developed, leveraging advanced algorithms and deep learning architectures to accurately classify arrhythmias into thirteen distinct categories.
• This model will be trained and fine-tuned using the preprocessed data, allowing it to learn intricate patterns and correlations associated with different arrhythmic conditions.
• The system's real-time capability, coupled with its ability to process data from diverse sources, holds the potential to revolutionize cardiac health monitoring, providing timely and
accurate insights for early intervention and personalized treatment strategies.
• Furthermore, the proposed system will be rigorously evaluated using established performance metrics to ensure its reliability and effectiveness in clinical settings, ultimately
contributing to improved patient outcomes in cardiac care.
22. Problem Statement:
• Cardiac arrhythmias pose a significant health risk, with the potential to
lead to severe complications such as strokes and heart failure. Timely
and accurate detection of these irregular heart rhythms is crucial for
effective clinical intervention.
• The complexity of arrhythmia patterns and the vast amount of data
associated with cardiac health present challenges in developing precise
and efficient diagnostic tools.
• Traditional methods often fall short in providing real-time, automated,
and reliable identification of arrhythmias, leading to delays in
treatment and an increased burden on healthcare systems.
23. Solution to the Problem Statement
• This project addresses the challenge of cardiac arrhythmia detection through the application of
advanced machine learning techniques. By leveraging a dataset comprising 278 features from 1356
patient samples, the project employs the random forest algorithm for feature selection, identifying
the key determinants of arrhythmias.
• The subsequent implementation of classification algorithms, including the weighted k-nearest
neighbors (kNN) algorithm, on the reduced feature set aims to achieve accurate and efficient
categorization of different types of arrhythmias.
• The integration of machine learning into the diagnostic process not only streamlines analysis but
also holds the potential to provide faster and more reliable results.
• The project's solution seeks to contribute to the enhancement of cardiac healthcare by offering a
robust and automated method for arrhythmia detection.
• The goal is to empower healthcare professionals with a tool that can assist in early and accurate
diagnosis, facilitating prompt and targeted interventions.
• By marrying technological advancements with medical expertise, this project aspires to bridge the
gap between traditional diagnostics and the evolving landscape of digital health, ultimately
improving patient outcomes in the realm of cardiac care.
25. HARDWARE AND SOFTWARE REQUIREMENTS
SOFTWARE REQUIREMENTS
● Anaconda Navigator with Jupyter Notebook, Python 3.7
● Python 2.7, or Python 3.3 or higher
● Libraries: Matplotlib, Seaborn, NumPy, pandas, Keras, TensorFlow, Pillow, scikit-learn, OpenCV, os
HARDWARE REQUIREMENTS
● Processor: 64-bit, 2.8 GHz, 8.00 GT/s, i3/i5/i7
● Laptop or PC
● Web camera or mobile camera
● RAM: 8 GB or higher
● Operating system: Windows 8 or newer, 64-bit macOS 10.13+, or Linux, including Ubuntu, Red Hat, CentOS 6+, and others
27. Importing Libraries
'numpy' is a Python module for scientific computing. This library is used throughout the project and is imported as 'np'.
'pandas' is used to manipulate and analyse data. It is a BSD-licensed open-source library that provides basic data structures and data analysis tools. The pandas package is imported as 'pd'.
'matplotlib.pyplot' offers a collection of command-style functions that allow Matplotlib to operate similarly to MATLAB. It is imported as 'plt'.
'seaborn' is a Python data visualisation package, based on matplotlib, for attractive and informative statistical graphics.
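The imports described on this slide would typically look like the sketch below. The aliases 'np', 'pd', and 'plt' are the ones named above. The alias 'sns' for seaborn is an assumption, since the slide does not state one.

```python
import numpy as np                 # numerical computing, imported as np
import pandas as pd                # data manipulation and analysis, imported as pd
import matplotlib.pyplot as plt    # MATLAB-style plotting functions, imported as plt
import seaborn as sns              # statistical graphics; the alias sns is an assumption

# Print the installed versions as a quick sanity check of the environment.
print("numpy", np.__version__)
print("pandas", pd.__version__)
```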
28. Data Pre-processing
Pre-processing is the term for the adjustments made to the data before it is passed to the
algorithm, as seen in Figure 5.1. Data pre-processing is a method for transforming messy data into a tidy
collection.
To put it another way, data collected from several sources arrives in a raw form that is not
suitable for analysis.
Performing a NaN check
Checking for NaN values is critical during data pre-processing. Only a few NaN values were found in this
run.
Replacing NaN values
NaN values must be removed or replaced before modelling. This may be accomplished by:
• removing any column that has a large number of NaN values
• forward fill (fillna with the forward-fill method)
• backward fill (fillna with the backward-fill method)
• replacing NaN values with the column mean
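The NaN check and the four replacement options listed above can be sketched with pandas as follows. The small DataFrame, its column names, and the "more than half missing" rule for dropping a column are illustrative assumptions, not values taken from the project.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the raw dataset (hypothetical columns and values).
df = pd.DataFrame({
    "height": [170.0, np.nan, 165.0, 180.0],
    "heart_rate": [72.0, 80.0, np.nan, 64.0],
    "mostly_missing": [np.nan, np.nan, np.nan, 1.0],
})

# NaN check: number of missing values in each column.
nan_counts = df.isna().sum()
print(nan_counts)

# Option 1: drop columns where more than half of the values are NaN
# (the one-half threshold is an assumption made for this sketch).
sparse_cols = nan_counts[nan_counts > len(df) / 2].index
cleaned = df.drop(columns=sparse_cols)

# Option 2: forward fill, each NaN takes the previous valid value in its column.
forward_filled = cleaned.ffill()

# Option 3: backward fill, each NaN takes the next valid value in its column.
backward_filled = cleaned.bfill()

# Option 4: replace each NaN with the mean of its column.
mean_filled = cleaned.fillna(cleaned.mean())
print(mean_filled)
```

Forward and backward fill assume that neighbouring rows are related, which holds for time series but not necessarily for independent patient records. For this reason, mean imputation is often the safer default for tabular data.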
29. Data analysis
• Data analysis is the process of inspecting, cleaning, transforming, and modelling data with the aim of revealing
relevant information, informing conclusions, and supporting decision-making.
• Data analysis has many components and steps and includes a wide variety of methods, under
different names, that are applied in business, scientific, and social science fields. Because it
helps organisations operate more efficiently and make evidence-based decisions, data analysis is essential
in today's business environment.
30. Feature extraction
• Feature extraction is the process of converting raw data into
numerical features that can be processed while preserving the information in the
original data set. Compared with applying machine learning directly to raw
data, it produces better results.
• When a random forest is trained on a dataset, it is possible to quantify
how much each feature lowers impurity. The more a feature
reduces impurity, the more significant it is.
• In random forests, the impurity decrease from each feature is
averaged across the trees to determine the feature's final
importance.
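The impurity-based importance described above is exposed by scikit-learn as the `feature_importances_` attribute of a fitted random forest. The sketch below uses synthetic data from `make_classification` as a stand-in for the ECG features. The sample size, feature count, and number of trees are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the dataset: 20 features, of which 5 carry signal.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_redundant=0, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Mean impurity decrease per feature, averaged over all trees in the forest.
importances = forest.feature_importances_

# The same quantity computed explicitly from the individual trees.
per_tree = np.array([tree.feature_importances_ for tree in forest.estimators_])
averaged = per_tree.mean(axis=0)

# Feature indices ordered from most to least important.
ranking = np.argsort(importances)[::-1]
for idx in ranking[:5]:
    print(f"feature {idx}: importance {importances[idx]:.3f}")
```

The importances sum to 1, so each value can be read as a feature's share of the total impurity reduction.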
31. PREDICTION AND ACCURACY
• The machine learning algorithms described are trained to predict the arrhythmia class of a patient record. The ability to
predict the class correctly is critical in helping healthcare professionals detect arrhythmias early and choose
appropriate interventions. Simply
put, accuracy refers to how well the machine learning model predicts the correct class for a given
observation, that is, the fraction of observations whose predicted class matches the true class.
TRAIN AND TEST DATASET
• Once the data has been cleaned, visualised, and explored, the first machine learning model can be fitted to it.
The data is split into two sets: one for training and one for testing.
• Training dataset: the portion of the data used to fit the model.
• Test dataset: the held-out portion of the data used to give an unbiased assessment of the final model fitted on the training dataset.
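The split and the accuracy measurement can be sketched with scikit-learn as shown below. The synthetic data, the 75/25 split ratio, and the decision tree used as a placeholder model are assumptions made for illustration. The slides do not state the split ratio used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the pre-processed dataset.
X, y = make_classification(n_samples=400, n_features=10, n_informative=4,
                           random_state=0)

# Hold out 25% of the rows for testing; stratify keeps the class proportions
# similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Fit on the training part only, then predict the held-out part.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy is the fraction of test observations predicted correctly.
accuracy = accuracy_score(y_test, y_pred)
manual_accuracy = (y_pred == y_test).mean()
print(f"test accuracy: {accuracy:.4f}")
```

Because accuracy weights every observation equally, a model can score well on an imbalanced dataset while still missing rare classes. Per-class metrics are a useful complement.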
32. ALGORITHM
Random Forest Algorithm
The random forest algorithm is a classification algorithm. In this project, random forest is used to obtain the principal attributes. The given dataset
may contain redundant attributes or attributes without correct values, and these should be removed. Random
forest is therefore used together with PCA to remove the unwanted attributes, and the result is stored in a new dataset called reduced features.
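One possible reading of "random forest with PCA" is a two-step reduction: keep the features the forest ranks as important, then project them with PCA. The sketch below follows that reading. It is an assumption about the pipeline, not the project's confirmed method. The median importance threshold, the 95% variance target, and the synthetic data are also assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the raw feature matrix.
X, y = make_classification(n_samples=300, n_features=40, n_informative=6,
                           random_state=0)

# Step 1: keep the features whose forest importance is at least the median.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0),
    threshold="median")
X_selected = selector.fit_transform(X, y)

# Step 2: standardise, then keep enough principal components to explain
# 95% of the variance of the selected features.
X_scaled = StandardScaler().fit_transform(X_selected)
reduced_features = PCA(n_components=0.95).fit_transform(X_scaled)

print("original features:", X.shape[1])
print("after forest selection:", X_selected.shape[1])
print("after PCA:", reduced_features.shape[1])
```

In a full experiment, the selector, scaler, and PCA would be fitted on the training portion only, so that no information from the test set leaks into the reduced features.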
K-Nearest Neighbour
Here KNN is used to measure the classification accuracy on the given dataset. First the KNN classifier is imported, then the dataset is divided into
training and test sets. The accuracy is then measured using the accuracy score. The measured accuracy is 52.09%.
SVM classifier
Here SVM is used to measure the classification accuracy on the given dataset. First the SVM classifier is imported, then the dataset is divided into
training and test sets. The accuracy is then measured using the accuracy score. The measured accuracy is 96.67%.
33. ALGORITHM (CONTD…)
Logistic Regression
Here Logistic Regression is used to measure the classification accuracy on the given dataset. First the Logistic Regression classifier is imported, then the
dataset is divided into training and test sets. The accuracy is then measured using the accuracy score. The measured accuracy is 54.61%.
Naïve Bayes
Here Naïve Bayes is used to measure the classification accuracy on the given dataset. First the Naïve Bayes classifier is imported, then the dataset is divided
into training and test sets. The accuracy is then measured using the accuracy score. The measured accuracy is 54.61%.
Weighted KNN
Here Weighted KNN is used to measure the classification accuracy on the given dataset. First the Weighted KNN classifier is imported, then the dataset is
divided into training and test sets. The accuracy is then measured using the accuracy score. The measured accuracy is 97.78%.
Decision Tree
Here the Decision Tree is used to measure the classification accuracy on the given dataset. First the Decision Tree classifier is imported, then the dataset is
divided into training and test sets. The accuracy is then measured using the accuracy score. The measured accuracy is 97.78%.
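The same import, split, fit, and score procedure applies to every classifier listed on these slides, so it can be sketched as a single loop. The code below runs on synthetic data, so its scores will not match the figures reported above. The hyperparameters (five neighbours, default SVM kernel) and the scaling step are assumptions. Weighted KNN is taken here to mean scikit-learn's `KNeighborsClassifier` with `weights="distance"`, which may differ from the variant used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic multi-class stand-in for the reduced feature set.
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Distance-based and linear models are scaled first; trees and naive Bayes
# do not need scaling.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Weighted KNN": make_pipeline(
        StandardScaler(),
        KNeighborsClassifier(n_neighbors=5, weights="distance")),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training set and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.2%}")
```

With `weights="distance"`, closer neighbours get a larger vote than distant ones, whereas plain KNN gives every one of the k neighbours an equal vote.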