Using Alarm Clustering to Minimise Intensive Care Unit False Alarms
Gearóid Lennon
University College Cork
Dissertation submitted to the Department of Computer Science,
University College Cork, in partial fulfilment of the requirements for the
award of Master of Science Degree in Data Science and Analytics
September 2015
FALSE ALARM MINIMISATION II
Declaration
Candidate's Declaration
I hereby declare that this dissertation is the result of my own original
work and that no part of it has been presented for another degree in this
University or elsewhere.
Candidate's Signature: ..........................  Date: ..............
Name: ....................................................
Supervisor's Declaration
I hereby declare that the preparation and presentation of the
dissertation were supervised in accordance with the guidelines on supervision
of dissertations laid down by University College Cork.
Supervisor's Signature: ..........................  Date: ..............
Name: ....................................................
Abstract
Electrocardiogram machines in the Intensive Care Unit of hospitals emit
alarms upon recording certain events and abnormalities in the heart rate and
blood pressure readings. Studies show that up to 86% of alarms produced by
these machines are false alarms. These false alarms can lead to stress among
patients and desensitization among hospital staff, resulting in possible adverse
medical outcomes and slower response times.
Alarm mining techniques can be used to reduce the rate of false alarms and
provide a more accurate reading from monitors. Classification models such as
Support Vector Machines have been trained to learn a subset of the
Multiparameter Intelligent Monitoring in Intensive Care II database and have
been shown to reduce the level of false alarms by 71.73%, but these models
also falsely identify 7.91% of true alarms as false.
The Multiparameter Intelligent Monitoring in Intensive Care II database,
available from PhysioNet.org, provides waveform data accompanied by
clinically classified alarm annotations and numeric data. The waveform data
includes Electrocardiogram waveforms, Arterial Blood Pressure waveforms
and Pulmonary Arterial Pressure waveforms. The arrhythmia alarm type is
included in the data.
I used both alarm classification techniques and alarm clustering techniques to
reduce the false alarm rate in the dataset.
The clustering techniques reduced false alarms by 9.17%, with no true alarms
being wrongly suppressed as false positives.
This shows that clustering techniques are not as effective as classification
techniques; however, in an environment where false negatives can be fatal,
they are safer.
Acknowledgements
I would like to thank Professor Gregory Provan for securing quality
data and providing supervision and support throughout the process of this
project. I would also like to thank my mother for proof-reading my paper and
supporting me throughout this project.
TABLE OF CONTENTS
Declaration.........................................................................................ii
Abstract.............................................................................................iii
Acknowledgements ............................................................................ v
Tables................................................................................................ ix
Figures ............................................................................................... x
Using Alarm Clustering to Minimise Intensive Care Unit False Alarms
...................................................................................................................... 1
CHAPTER 1: INTRODUCTION ....................................................... 1
Background .................................................................................... 1
Statement of Thesis......................................................................... 2
Significance / Relevance................................................................. 2
Hypotheses ..................................................................................... 3
Null Hypotheses. ........................................................................ 3
Alternative/Research Hypotheses................................................ 3
Objectives....................................................................................... 4
Organisation ................................................................................... 4
Literature.................................................................................... 4
Formulation of Thesis. ................................................................ 4
Extraction of data........................................................................ 5
Data pre-processing. ................................................................... 5
Data cleaning.............................................................................. 5
Discretisation and standardisation of data.................................... 6
Feature selection. ........................................................................ 6
Classification algorithm. ............................................................. 6
Clustering. .................................................................................. 6
Comparison of results and writing up of paper. ........................... 6
Scope of the study .......................................................................... 6
Limitations of the study .................................................................. 7
CHAPTER 2: LITERATURE REVIEW ............................................. 8
False-alarm reduction in monitoring systems .................................. 8
Alarm clustering ........................................................................... 12
Alarm classification...................................................................... 15
CHAPTER 3: METHODOLOGY..................................................... 18
Data.............................................................................................. 18
Data cleaning................................................................................ 30
Research Instrument ..................................................................... 36
Discretisation............................................................................ 37
Normalisation ........................................................................... 38
Cross-validation........................................................................ 38
Feature selection....................................................................... 39
Classification model ................................................................. 39
K-means clustering ................................................................... 41
Tables
Table 1. Suppression rates of related work on MIMIC and MIMIC II........... 11
Table 2. Summary of the dataset................................................................... 36
Table 3. Results of Support Vector Machine model using scaled and
continuous data............................................................................................ 45
Table 4. Results of Support Vector Machine model using discrete data and all
features ........................................................................................................ 45
Table 5. Results of k-means clustering model............................................... 46
Table 6. Results of k-medoids clustering model............................................ 46
Figures
Figure 1. ECG rhythm strip indicating Asystole (Caggiano, 2015)................ 19
Figure 2. ECG strip indicating Bradycardia (Resuscitation Council (UK), 2015)
.................................................................................................................... 19
Figure 3. ECG strip indicating Extreme Tachycardia (Chung, 2015)............. 19
Figure 4. ECG of Ventricular Tachycardia (Compton, 2015)......................... 20
Figure 5. ECG of Ventricular Fibrillation (Medical Training and Simulation
LLC, 2015) .................................................................................................. 20
Figure 6. Representation of a typical alarm scope......................................... 23
Figure 7. Sample twenty second scope of Extreme Tachycardia alarm, created
using WAVE................................................................................................. 24
Figure 8. The first 0.44 seconds of the twenty second scope outlined in Figure
7................................................................................................................... 25
Figure 9. Scatter plot of Age and Mean ABP Height (non-null values).......... 32
Figure 10. Scatter plot of Age and Mean PAP (non-null values).................... 33
Figure 11. Sample discretisation................................................................... 37
Figure 12. Linear Support Vector Machine classifier .................................... 40
Figure 13. Visual representation of results .................................................... 47
Figure 14. ROC Curve for k-medoid model.................................................. 47
Using Alarm Clustering to Minimise Intensive Care Unit False Alarms
CHAPTER 1: INTRODUCTION
Background
Frequently, patients in the Intensive Care Unit of hospitals are
connected to bedside monitors while suffering or recovering from various
ailments. These monitors take constant records of critical patient data
including respiratory status, heart rate, blood pressure readings and body
temperature. They serve the important role of providing physicians and other
health care professionals with vital signs in real time.
Many of these monitors also save data that can be analysed later to
provide physicians with extra in-depth patient progress and to help provide
essential data for medical research.
These machines store various thresholds that, when exceeded,
cause the machine to signal an alarm.
The alarms I focused on in this project were Arrhythmia Alarms.
Arrhythmia alarms are activated when there is a problem with the rate or
rhythm of the heartbeat (NIH, 2011). I focussed solely on life-threatening
arrhythmia alarms, also known as 'red-alarms'.
When these red-alarms are activated in hospitals they produce loud,
high-pitched noises that can be distressing for both patients and staff. This
causes adverse health effects for patients and added stress for staff. Therefore,
ensuring these are activated only when necessary is important for the well-
being of all persons in the Intensive Care Unit.
A study carried out in a paediatric Intensive Care Unit (Lawless, 1994)
found that over 94% of alarm soundings may not be clinically important. A
further study carried out in a twelve-bed medical intensive care unit (Siebig, et
al., 2010) found that 83% of alarms were irrelevant. These high false alarm
rates desensitise staff with the result of what Lawless calls the 'crying wolf'
effect. Staff are less likely to react with the same haste to alarms when the vast
majority are false, and sometimes nurses even inappropriately disable these
alarms (Sowan, Tarriela, Gomez, Reed, & Rapp, 2015).
Statement of Thesis
The data mining technique of clustering can reduce this serious rate of
false positive Intensive Care Unit alarms without suppressing any true alarms,
thereby introducing no further risk to the Intensive Care Unit.
Significance / Relevance
It is hoped that the conclusion of this study will guide and improve
bedside patient monitors in the ICU. It is clear that real-time data analytics is
one effective way to do this. However, we must be careful with any
improvements that we introduce to these machines. Right now these machines
seem to err on the side of caution, applying a 'better safe than sorry' policy
to alarm emissions. This is a sensible policy, as it means that no potentially
life-threatening incident is ignored by the monitor. However, the added
problems brought about by frequent false alarms need to be addressed, and the
very high rate of clinically insignificant alarms has not been greatly reduced
in the last twenty-one years: the rate was 94% in 1994 and, in a similar study,
83% in 2010. Continued improvements are needed if these monitors
are to be trusted by health care professionals. This extremely important field of
medicine is constantly evolving and this is one way we, in data analytics, can
help bring about important changes.
Previously, various studies have been very accurate in classifying
alarms using data mining techniques, but these studies have still suppressed
some true alarms, which effectively renders them impractical: a false negative
alarm could potentially be fatal.
In this study I aimed to achieve a 0% false negative rate in my
classification of alarms, even at the cost of lower accuracy in identifying
false positives.
Hypotheses
Null Hypotheses.
• k-means clustering cannot significantly reduce the rate of false
positive arrhythmia alarms without any false negative alarms.
• k-medoids clustering cannot significantly reduce the rate of
false positive arrhythmia alarms without any false negative
alarms.
• k-medoids clustering has an equal false positive reduction rate
with k-means clustering.
Alternative/Research Hypotheses.
• k-means clustering can significantly reduce the rate of false
positive arrhythmia alarms without any false negative alarms.
• k-medoids clustering can significantly reduce the rate of false
positive arrhythmia alarms without any false negative alarms.
• k-medoids clustering has a higher false positive reduction rate
than k-means clustering.
Objectives
• Create a usable dataset from the MIMIC II database
containing summary statistics related to each alarm.
• Apply multiple linear regression to the data features to fill any
missing values.
• Apply a Support Vector Machine classification algorithm to this
data and obtain false suppression and true suppression rates as a
reference.
• Apply a k-means clustering algorithm to the same data and
obtain false suppression rates and true suppression rates.
• Apply a k-medoids clustering algorithm to the same data and
obtain false suppression rates and true suppression rates.
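The regression-imputation objective can be sketched as follows. This is a hedged illustration using numpy's least-squares solver with an invented two-feature matrix; the actual model and feature set used in the study are described in the methodology chapter.

```python
import numpy as np

def regression_impute(X, target_col):
    """Fill missing values (NaN) in one column of a feature matrix by
    regressing that column on the remaining columns (least squares)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X[:, target_col])
    predictors = np.delete(X, target_col, axis=1)
    # Design matrix with an intercept term, fitted on complete rows only
    A = np.column_stack([np.ones(len(X)), predictors])
    coef, *_ = np.linalg.lstsq(A[~missing], X[~missing, target_col], rcond=None)
    filled = X.copy()
    filled[missing, target_col] = A[missing] @ coef
    return filled
```

For example, if a feature is an exact linear function of another, the missing entry is recovered exactly; in practice the fit is only approximate and the filled values are estimates.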
Organisation
This research project involved many stages before I arrived at the final
paper. I will briefly describe the organisation in this section but will elaborate
on each stage in more detail in the ensuing chapters.
Literature.
I initially started the project by reading about the field of alarm mining.
This involved reading about network-based intrusion detection systems, house
alarms and Intensive Care Unit alarms, and about the methods being used in
all three of those areas.
Formulation of Thesis.
I considered many projects including various research questions related
to reducing false alarms in home intrusion detection systems using both
classification techniques and clustering algorithms. The availability of the
MIMIC II database and the attraction of medical research drew me towards
my ultimate research question. I surveyed the previous research in the area and
identified a gap that had not been explored surrounding clustering algorithms.
I believed I could design a clustering algorithm that could minimise, if not
eliminate, false negative alarms in the ICU while reducing false positives.
Therefore, I arrived at my statement of thesis.
Extraction of data.
As I was using a Windows machine, I used the Cygwin dynamic-link
library to simulate a Linux environment in order to run the WFDB Software
Package required to access, view and download waveform data from the
PhysioNet research resource. I used Python to extract thousands of alarm
scopes from the MIMIC II database, relying heavily on the Anaconda Python
distribution, which contains useful modules such as pandas and numpy (used
to store arrays, time series and data frames that can be easily manipulated),
matplotlib (for visualisation) and scikit-learn (for built-in machine learning
algorithms).
Data pre-processing.
As the waveform records were very voluminous, I needed to create
summary statistics from each waveform. This was done as the data was
extracted online, so that the raw waveforms did not occupy a considerable
amount of disk space. The result was a single comma-separated values (.csv)
file.
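The reduction step above can be sketched as follows. The particular statistics shown (duration, mean, standard deviation, minimum, maximum) are illustrative assumptions, not the study's exact feature list; the 125 Hz sampling rate matches the MIMIC II waveforms.

```python
import numpy as np

def summarise_scope(signal, fs=125):
    """Reduce one alarm scope (a window of 125 Hz waveform samples)
    to a handful of summary statistics, so the raw samples need not
    be kept on disk. The statistics here are illustrative."""
    signal = np.asarray(signal, dtype=float)
    return {
        "duration_s": len(signal) / fs,  # samples / sampling rate
        "mean": float(np.mean(signal)),
        "std": float(np.std(signal)),
        "min": float(np.min(signal)),
        "max": float(np.max(signal)),
    }
```

One such dictionary per alarm scope, written out row by row, yields the single .csv file described above.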
Data cleaning.
To clean the data I used a combination of Python and R statistical
software.
Discretisation and standardisation of data.
I created two separate datasets from the resulting data:
1. A discretised dataset: All the continuous features were
converted to discrete features
2. A standardised dataset: All continuous features remained
continuous but were normalised.
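A minimal sketch of both transformations, assuming equal-frequency binning for the discretised dataset and z-score normalisation for the standardised one (the precise schemes used are detailed in the methodology chapter):

```python
import numpy as np

def equal_frequency_bins(values, n_bins=6):
    """Discretise a continuous feature into n_bins equal-frequency bins,
    returning integer bin labels 0..n_bins-1."""
    values = np.asarray(values, dtype=float)
    # Interior percentile cut points, e.g. 5 edges for 6 bins
    edges = np.percentile(values, np.linspace(0, 100, n_bins + 1)[1:-1])
    return np.digitize(values, edges)

def standardise(values):
    """Normalise a continuous feature to zero mean and unit variance."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()
```

Equal-frequency binning puts (roughly) the same number of alarms in each bin, which avoids sparsely populated categories when the feature distribution is skewed.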
Feature selection.
The number of features was reduced to avoid the 'curse of
dimensionality'.
Classification algorithm.
I applied a Support Vector Machine algorithm to the data using the
scikit-learn package in Python and obtained results (using cross-validation)
as a reference.
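This step can be sketched with scikit-learn as below. The feature matrix here is synthetic stand-in data with two shifted classes; the real alarm features, kernel choice and parameters used in the study differ.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for the alarm feature matrix: class 0 (false alarm)
# and class 1 (true alarm) with shifted feature means.
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(2, 1, (100, 5))])
y = np.array([0] * 100 + [1] * 100)

# RBF-kernel SVM scored with 10-fold cross-validation.
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=10)
print(scores.mean())
```

`cross_val_score` handles the fold splitting (stratified by class for classifiers), so each alarm is scored exactly once as held-out data.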
Clustering.
I applied k-means and k-medoids clustering, finding the optimal value
for k (i.e. the optimal number of clusters), and obtained results using
cross-validation.
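A minimal k-means sketch in numpy follows (Lloyd's algorithm with a simple deterministic farthest-point initialisation); the study's actual implementation, distance metric and parameters may differ.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Minimal Lloyd's-algorithm k-means: returns (centroids, labels)."""
    X = np.asarray(X, dtype=float)
    # Deterministic farthest-point initialisation: start from the first
    # point, then repeatedly add the point farthest from chosen centroids.
    centroids = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each centroid moves to the mean of its points
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels
```

Unlike k-medoids, the centroids here are means and need not be actual alarms from the dataset; that distinction is taken up in the literature review.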
Comparison of results and writing up of paper.
I then compared the false alarm and true alarm suppression rates of
Support Vector Machines, k-means clustering and k-medoids clustering and
arrived at my conclusions. Finally, I compiled the research paper.
Scope of the study
This study focusses solely on arrhythmia alarms, which relate to the
heart, a very significant area of ICU care. However, alarms related to
respiratory problems or other medical alarms that sound in an Intensive Care
Unit are not addressed. Given reliably annotated data, a similar study could be
conducted in the fields related to the various other alarm types in the Intensive
Care Unit, and indeed other units of a hospital. Data mining techniques are not
the only techniques being explored to reduce false alarm rates in hospitals;
research is also being conducted by the medical community and the data
analytics community (Chambrin, 2001). This study particularly addresses clustering
algorithms and how they can help alleviate the problem.
Limitations of the study
I came across some limitations while carrying out this study. The first
was that I was forced to complete my project using version 2 of the MIMIC II
database. There is a third version of the database, which is four times larger
than version 2, and is of better quality (Massachusetts Institute of Technology,
2015). Version 2 dates back to 2010, while version 3 was updated in 2012.
However, version 3 did not contain the annotated arrhythmia alarms that I
needed to complete my study, so I chose version 2.
The waveform signals often had gaps: the file would indicate that a
signal was present, but no values were recorded in the time series. In these
cases the values were presented as '-'.
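Such gap markers are straightforward to handle in pandas by coercing the column to numeric, which turns the '-' entries into NaN so they can be dropped or imputed downstream. The sample values below are invented for illustration.

```python
import pandas as pd

# A signal column as extracted, with '-' marking gap samples
raw = pd.Series(["82.0", "-", "85.5", "-", "90.1"])

# errors="coerce" converts anything non-numeric to NaN
hr = pd.to_numeric(raw, errors="coerce")
print(hr.isna().sum())  # two gap samples become NaN
```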
CHAPTER 2: LITERATURE REVIEW
False-alarm reduction in monitoring systems
A survey (Hubballi & Suryanarayanan, 2014) was carried out on the
techniques used to minimize false alarms in network-based intrusion detection
systems.
Intrusion detection systems (IDS) are an essential component of
a complete defense-in-depth architecture for computer network
security. IDS collect and inspect audit data looking for evidence of
intrusive behaviors [sic]. As soon as an intrusive event is detected, an
alarm is raised giving the network administrator the opportunity to
promptly react.
(Perdisci, Giacinto, & Roli, 2006).
The techniques investigated in this survey are applicable to other false
alarm minimisation areas, including home-based intrusion detection systems,
and medical alarm systems. The techniques investigated included the alarm
mining techniques of:
ο· Alarm clustering
ο· Alarm classification
One of the questions they asked the research community was:
'Evaluation on a common dataset: We find most of the works evaluate their
techniques on a local custom dataset. The performance of system analysed in
terms of false positives reduction ratio on a common dataset will help
understand the usefulness.' (Hubballi & Suryanarayanan, 2014, p. 15).
My study will evaluate alarm mining techniques on a common dataset
to understand the usefulness of each technique.
A large, freely available database for this project is the MIMIC II
database, described by Saeed, Lieu, Raber, & Mark (2002). MIMIC stands for
Multiparameter Intelligent Monitoring in Intensive Care. It is being used for
many purposes, including decision-support systems, intelligent patient
monitoring research, medical data mining and knowledge discovery.
Consequently false alarm minimisation is being tackled using these
techniques. MIMIC II contains records of 'ICU patients admitted to an 8-bed
Medical Intensive Care Unit (MICU) and an 8-bed Coronary Care Unit
(CCU)… Each record consisted of four continuously monitored waveforms (2
Leads of ECG, Arterial Blood Pressure, and Pulmonary Artery Pressure)
sampled at 125 Hz, 30 1-minute parameters (HR, BP, SpO2, Cardiac Output),
and monitor-generated alarms.' (Saeed, Lieu, Raber, & Mark, 2002)
The gold standard in alarm classification (Aboukhalil, Nielsen, Saeed,
Mark, & Clifford, 2008) was produced for the MIMIC II waveform database
by classifying each alarm as true or false. Eleven volunteers reviewed the
alarms: an experienced physician, four experienced signal processing experts
and six graduate students with training in cardiac electrophysiology. Two
annotators worked together to mark each alarm as true or false, and only
alarms on which they could agree were included in the 'gold standard' set.
This gold standard set is now available with the MIMIC II Database, and
each alarm in the set is annotated (1 for true, 3 for false).
Aboukhalil et al. (2008) then applied an algorithm based on morphological and
timing information derived from the Arterial Blood Pressure signal to reduce
the false alarm rate. Their false alarm suppression algorithm suppressed an
average of 63.2% of false alarms and an average of 1.4% of true alarms. I
would consider this an extremely successful reduction of false positives. There
could be no doubt that a reduction of this scale would make a significant
improvement to the Intensive Care Unit environment and safety. However,
despite being a small figure, the suppression of 1.4% of true alarms is
dangerous. It means that, of the 1470 alarms in the test set, 20 real alarms
were incorrectly identified as false positives when, in fact, they were true
positives. These are twenty real-life red-alarms that might have gone unnoticed
if this algorithm were implemented in real time. So, although the false alarm
suppression of this algorithm is impressive, it is far more important to have
a 0% false negative rate, even at the cost of a lower false positive
suppression rate, given the potential for fatality.
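The arithmetic behind this point is simple: a small-sounding percentage still corresponds to a count of missed life-threatening alarms.

```python
# Suppression counts as described above: 20 of the 1470 alarms in the
# test set were true alarms that the algorithm wrongly suppressed.
true_alarms_suppressed = 20
alarms_in_test_set = 1470

false_negative_rate = true_alarms_suppressed / alarms_in_test_set
print(round(false_negative_rate * 100, 1))  # 1.4 (%)
```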
Clifford, et al. (2006) looked only at the ABP waveform of the original
MIMIC DB (the predecessor of MIMIC II), as they believed it 'is perhaps the
least noisy pressure signal commonly available'. They achieved good results;
however, they only tested on a gold standard subset of 89 alarms, and they did
not use data mining techniques, but rather individual examination of the ABP
waveform. In other work, Baumgartner, Rodel, & Knoll (2012) did use alarm
mining techniques. The classification algorithms they used were:
• Naïve Bayes
• Decision Trees
• Support Vector Machines
• k Nearest Neighbours
• Multi-Layer Perceptron
They achieved an 84.7% classification accuracy across the complete set using
10-fold cross validation, a 6-bin equal frequency discretisation of continuous
variables and a Support Vector Machine with a radial basis function kernel.
This SVM method achieved the best overall accuracy of all the five
classification methods they tested. The false alarm suppression rate was
71.73% with a true alarm suppression rate of 7.91%.
They also used the same gold standard as Aboukhalil, et al. (2008).
Table 1. Suppression rates of related work on MIMIC and MIMIC II

    Study                                           False Positives   False Negatives
    Aboukhalil, et al. (test set)                   63.2%             1.4%
    Baumgartner, et al. (10-fold cross validation)  71.73%            7.91%
    Clifford, et al.                                100%              0%
A 4-way analysis of variance (ANOVA) was applied to the MIMIC II
database by Hu, et al. (2012). They used the 4-way ANOVA to investigate the
influence of four algorithm parameters on the performance of the data mining
approach. They used 10-fold cross validation and then applied association rule
mining. They focussed on code blue events in hospitals rather than arrhythmia
red alarms. These code blue events relate to cardiopulmonary arrest, which can
be an asystole in extreme cases, but a code blue event is any event in a
hospital in which the patient requires resuscitation. They used a 'SuperAlarm'
set, which is a set of predictive combinations of monitor alarms. This
SuperAlarm set comprised training data and new, unannotated patient
data, which was used as a test set. They focussed on predicting when an alarm
would activate more accurately than bedside monitors do. The sensitivity to
alarms was between 66.7% and 90.9%, depending on the algorithm
parameters. They accepted that more study was needed in this area with
different and more complex algorithms.
The goal in all studies is to maximise the number of false alarms
suppressed while minimising the number of true alarms suppressed.
Realistically, a true alarm suppression rate of 0% is needed, because the lives
of patients are at risk.
I studied the Hubballi & Suryanarayanan article to gain an insight into
the progress being made in the area of IDSs and how they could be applicable
to false positive arrhythmia alarm reduction. The main areas of interest to me
in this article were alarm clustering and alarm classification. I summarise the
main findings of these articles in the next two sub-sections.
Alarm clustering
The Hubballi & Suryanarayanan (2014) study mentions Julisch and
his series of articles (Julisch & Dacier, 2002), (Julisch K., 2001), (Julisch
K., 2003a) and (Julisch K., 2003b). Julisch uses clustering algorithms to group
Intrusion Detection System alarms. He uses 'root cause discovery' to learn the
patterns of false alarms in order to identify the root cause of an alarm and
prevent alarms from wrongly being classified as true when in fact they are
false. 'The root cause of an alarm…is the reason for which it is triggered'
(Julisch K., 2003b). Julisch states that in intrusion detection systems a few
dozen root causes account for over 90% of intrusion detection system alarms.
He proposed an alarm clustering method that supports the human analyst in
identifying these root causes. His hypothesis is that when alarms are clustered, or
grouped together, all alarms within a cluster share the same root cause.
Unfortunately, there is no algorithmic solution to the exact alarm-clustering
problem. Instead, an approximate alarm-clustering problem can be devised.
Julisch clustered alarms under the assumption that all alarms in a cluster
shared the same root cause. Then, from each cluster a generalised alarm was
derived. The generalised alarm is a feature, or set of features, that all alarms in
a cluster share. This is similar to the idea of the medoid of a cluster. Each
cluster in a k-medoids clustering case has a medoid at its centre, around which
all other alarms in the cluster gather. In a k-medoids clustering algorithm, the
medoid is an existing data point in the dataset; it is not a generalised alarm.
Julisch proposes developing the generalised alarm manually: 'Clearly, a
generalized alarm like this facilitates the identification of root causes, but
human expertise is still needed. Therefore, alarm clustering only supports root
cause analysis, but does not completely automate it' (Julisch K., 2003b). My
aim is for a purely machine learning technique, so that manual input is not
required. Machines trigger alarms without human intervention; therefore, I
believe the algorithm that suppresses false alarms should not require manual
human intervention.
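The contrast drawn above can be made concrete: a medoid is simply the existing cluster member that minimises the total distance to the rest of the cluster, and it can be found automatically. A minimal numpy sketch:

```python
import numpy as np

def medoid(points):
    """Return the index of the medoid: the existing data point with the
    smallest total Euclidean distance to all other points in the cluster."""
    points = np.asarray(points, dtype=float)
    # Pairwise distance matrix, then pick the row with the smallest sum
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    return int(d.sum(axis=1).argmin())
```

For the points (0,0), (1,0) and (10,0), the middle point (1,0) has the smallest total distance to the others, so it is the medoid; no manually generalised alarm is needed.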
Techniques to find the root cause of alarms were improved by
Al-Mamory & Zhang (2009; 2010). They took the generalised alarms and
produced filters that can be used to reduce the future alarm load. Their
algorithm made use of nearest-neighbour methods to cluster alarms, as my
algorithm does. Al-Mamory and Zhang reduced all false alarms by 74%, but
they do not state whether any true alarms were filtered out. However, they do
state: 'the filtering would be unsafe in case of misclassifying true positives
as false positives' (Al-Mamory & Zhang, 2009).
Multiple Intrusion Detection Systems were used by Perdisci, Giacinto,
& Roli (2006). The IDSs worked together to produce unified descriptions of
alarms. Similarly, I study several different signals in this project to gain a
holistic description of the alarms and the events leading up to them.
Perdisci, et al. (2006) found that clustering effectively summarised the attacks
and drastically reduced the number of alarms. The multiple IDSs work
together to identify attacks on the system and provide more comprehensive
descriptions of attacks, which they used to better formulate an 'alarm
signature'. An alarm correlation process can transform 'elementary alarms'
from individual IDSs into high-level intrusion reports. Part of this process is
alarm clustering, which fuses elementary alarms into a 'meta-alarm'. In my
project, medoids serve a similar role to meta-alarms: each alarm was added to
a cluster by comparing it to the cluster's 'meta-alarm' with a distance metric.
Rather than reducing false alarms, Perdisci, et al.'s (2006) aim was to
summarise thousands of alarms into meta-alarms. As a result, they reduced
alarm volume by between 51.1% and 80.3% over three experiments.
Clustering was further explored by Dey (2009), who used the
'Incremental Stream Clustering Algorithm' to reduce the number of false
alarms in IDS output from the DARPA 1999 network traffic dataset. DARPA
1999 is a 'standard corpora for evaluation of computer network intrusion
detection systems' (Lincoln Laboratory, MIT, 2015). Dey (2009) tested both
the Incremental Stream Clustering Algorithm and the K-Nearest Neighbour
Algorithm and found that clustering reduced the number of alarms by more
than 99%, better than K-nearest neighbours, which reduced false alarms by
93%. Chitrakar & Chuanhe (2012)
used k-medoids clustering with Support Vector Machine classification to
increase the detection rate of an Intrusion Detection System. They found that
this method produced better classification performance in terms of accuracy,
detection rate and false alarm rate when compared with a Naïve Bayes
classifier.
Although extensive alarm clustering research has been carried out in
the field of network-based Intrusion Detection Systems, clustering has not
been explored in the field of medical alarm reduction.
Alarm classification
Decision trees were explored by Kim, Shin, and Ryu (2004), who used
a decision tree as a false alarm classification model on an IDS dataset. It
effectively classified alarms as false alerts or true attacks. For feature
selection they used statistics-based correlation analysis and associative
feature construction: features were chosen for the decision tree model by
computing highly correlated attributes based on associative feature
construction.
Pietraszek (2004) used an Adaptive Learner for Alarm Classification
(ALAC) with background knowledge to reduce false IDS alarms. ALAC
works in conjunction with a human analyst: it learns by observing how
the analyst works manually, and from this behaviour it learns which
characteristics define false positives and true positives. As it does so it can
begin to estimate the confidence that an alert is a false positive based on its
characteristics. Alerts that ALAC determines as highly probable false positives
can be suppressed automatically, while alerts with lower probability are
passed on to the analyst, and ALAC continues to learn from these. This is a
complex machine learning approach that may be difficult to implement in
hospitals. This approach was further developed by Pietraszek and Tanner.
Pietraszek and Tanner (2005) also aimed to reduce false positives in
intrusion detection. Their approach was alert post-processing by data
mining and machine learning. The data mining approach is Julisch's (2003b)
root cause analysis, which Pietraszek and Tanner call "CLARAty (Clustering
Alerts for Root Cause Analysis)" (Pietraszek & Tanner, 2005). The machine
learning approach works hand-in-hand with the data mining approach by
building an alert classifier that distinguishes true from false positives.
Pietraszek's (2004) ALAC is used as the alert classifier. Pietraszek and Tanner
require a human analyst to verify decisions made by ALAC, but this
requirement can be removed as confidence in the approach increases.
Similarly, Manganaris, Christensen, Zerkle, and Hermiz (2000) used an
IDS management technique based on a knowledge database.
Tjhai, Furnell, Papadaki, and Clarke (2010) used clustering as a
means of classification when combined with another classifier. They
developed "a two-stage classification system using a SOM neural network and
K-means algorithm to correlate the related alerts and to further classify the
alerts into classes of true and false alarms" (Tjhai, Furnell, Papadaki, &
Clarke, 2010). An SOM, or Self-Organising Map, is a type of neural
network. In this study, yet again, the clustering is undertaken first, which
facilitates the classification of alarms.
Another group (Benferhat, Boudjelida, Tabia, & Drias, 2013) also
correlated alarms by revising probabilistic classifiers using expert knowledge.
Again, the expert knowledge works alongside the probabilistic classifiers. The
probabilistic classifiers used by Benferhat, Boudjelida, Tabia, and Drias (2013)
are "Naive Bayes, Tree Augmented Naïve Bayes (TAN), Hidden Naive Bayes
(HNB) and decision tree classifiers." However, their approach is applicable to
other probabilistic classifiers.
K-Nearest Neighbours were explored by Law and Kwok (2004). They
reduced false alarms in an IDS by 93%, consistent with Dey (2009).
IDS research may seem like a separate field, but it has the similar goal of
false-alarm minimisation. Usually network-based IDSs work together with a
human expert to identify malicious activity. Although medical professionals
can be seen as the equivalent of this human expert, their time spent checking
false alarms can often be better spent attending to other patients in need of
assistance.
Different alarm mining techniques learn in different ways, and some
can provide different information from others. It is important to test a variety
of data mining techniques so that we are not restricted to just one way of
learning the data.
The studies completed in alarm mining in this field are directly
influential in the methodology I use in arrhythmia alarm mining.
CHAPTER 3: METHODOLOGY
Data
The data I used for this project was the MIMIC II version 2 database,
freely and publicly available from the PhysioNet website
(www.physionet.org). "PhysioNet offers free web access to large collections of
recorded physiologic signals (PhysioBank) and related open-source software
(PhysioToolkit)" (Massachusetts Institute of Technology, 2015).
PhysioBank is a large and growing archive of well-
characterized digital recordings of physiological signals and related
data for use by the biomedical research community… PhysioToolkit is
a library of open-source software for physiological signal processing
and analysis, the detection of physiologically significant events using
both classic techniques and novel methods based on statistical physics
and nonlinear dynamics, the interactive display and characterization of
signals, the creation of new databases, the simulation of physiological
and other signals, the quantitative evaluation and comparison of
analysis methods, and the analysis of nonstationary processes.
PhysioNet is an on-line forum for the dissemination and exchange of
recorded biomedical signals and open-source software for analysing
them.
(Goldberger, et al., 2000)
The Multiparameter Intelligent Monitoring in Intensive Care (MIMIC)
database contains both numeric and waveform data of patients. The subset of
the data I looked at is described by Aboukhalil, Nielsen, Saeed, Mark, and
Clifford (2008). It focusses on annotated waveforms and numerics related to
"red alarms", which indicate one of five life-threatening heart problems:
1. Asystole
2. Extreme Bradycardia
3. Extreme Tachycardia
4. Ventricular Tachycardia
5. Ventricular Fibrillation/Tachycardia
"Asystole is cardiac standstill with no cardiac output and no ventricular
depolarization"
Figure 1. ECG rhythm strip indicating Asystole (Caggiano, 2015)
"A heartbeat that is too slow is called bradycardia" (NIH, 2011)
Figure 2. ECG strip indicating Bradycardia (Resuscitation Council (UK), 2015)
"A heartbeat that is too fast is called tachycardia" (NIH, 2011)
Figure 3. ECG strip indicating Extreme Tachycardia (Chung, 2015)
"Ventricular tachycardia (VT) refers to any rhythm faster than 100 (or
120) beats/min arising distal to the bundle of His."
Figure 4. ECG of Ventricular Tachycardia (Compton, 2015)
"Ventricular fibrillation (v-fib for short) is the most serious cardiac
rhythm disturbance. The lower chambers quiver and the heart can't pump any
blood, causing cardiac arrest." (American Heart Association, 2015)
Figure 5. ECG of Ventricular Fibrillation (Medical Training and Simulation LLC, 2015)
The MIMIC II database contains several waveforms, but only three
were focussed on in this project:
1. Electrocardiogram (limb lead II of Einthoven's triangle), known as
"ECG" or "II"
2. Arterial Blood Pressure, known as "ABP"
3. Pulmonary Arterial Pressure, known as "PAP"
These were the same three waveforms that were the focus of the study by
Baumgartner et al. (2012), which I used as a key reference for this paper.
Aboukhalil et al. (2008) identified all records within MIMIC II that had at
least one of the red alarms while also having II and/or ABP records. They then
stored the record names in a file, RECORDS.alM, which is available within the
MIMIC II database. There is a more recent version of the MIMIC database
(MIMIC II, version 3), but unfortunately it contains neither the alarm
annotations nor the file RECORDS.alM, so I opted to use version 2 instead.
The data consists of records collected by heart rate monitors relating to
patients in the ICU, collected over varying lengths of time depending on the
patient. "Waveforms were stored at 125 Hz" (Aboukhalil, Nielsen, Saeed,
Mark, & Clifford, 2008), meaning that for any one-second period, 125 values
were stored for each waveform.
The data is accessed using the WFDB Software Package, free software
developed by PhysioNet. Each record covers just one patient over a
length of time in which several alarms may have occurred. Associated with
each record are waveform and numeric data.
The numeric data consists of measurements taken once every minute.
I took the following features from the numeric files:
1. Sex of patient
2. Age of patient
3. Central Venous Pressure (CVP)
4. Oxygen saturation (SpO2).
I then extracted summary statistics from the three waveforms (ECG,
ABP and PAP). I needed to ensure the statistics represented each alarm, and
not each record, because most records contained several alarms. Therefore, I
defined a length of time to represent each alarm, known as the alarm
scope. Aboukhalil et al. (2008) provide the typical time delay until an alarm is
activated. For Asystole, Extreme Bradycardia and Ventricular
Fibrillation/Tachycardia there is a typical time delay of 5 seconds, so for
every alarm set off by these arrhythmias, the event causing it happened
five seconds previously. The AAMI (2002) standards require that asystole and
rate-limit arrhythmia alarms be triggered within ten seconds of the onset of the
event, meaning I could only use data up to 5 seconds after the alarm was
activated. Any subsequent data would arrive later than 10 seconds after the
event, and it would then be too late to emit an alarm without violating AAMI
standards. As Baumgartner, Rodel, and Knoll (2012) were used as a reference
point for this project, the same alarm scope was used: "The signals had to
persist continuously over a time window (alarm scope) of 20 seconds around
the alarm event (15 s before and 5 s after)" (Baumgartner, Rodel, & Knoll,
2012).
Figure 6. Representation of a typical alarm scope
Figure 6 gives a visual representation of when the alarm and the event occur
over the course of the scope. It indicates the 10-second window of data that
can be collected after the event in order to comply with AAMI standards; this
scope uses the maximum time allowed without violating the standard.
Underneath is a corresponding sample Extreme Tachycardia alarm scope.
Figure 7. Sample twenty second scope of Extreme Tachycardia alarm, created using WAVE.
Figure 8. The first 0.44 seconds of the twenty second scope outlined in Figure 7.
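As a sketch of the scope arithmetic at 125 Hz, the window around an alarm can be computed as follows (`alarm_scope` is a hypothetical helper, not part of the WFDB tools):

```python
SAMPLE_RATE = 125  # waveform samples per second in MIMIC II

def alarm_scope(alarm_sample, pre_s=15, post_s=5):
    """Return [start, end) sample indices of the 20 s scope around an alarm:
    15 s before and 5 s after, i.e. 2,500 samples at 125 Hz."""
    start = max(alarm_sample - pre_s * SAMPLE_RATE, 0)  # clamp at record start
    end = alarm_sample + post_s * SAMPLE_RATE
    return start, end

start, end = alarm_scope(10_000)
# for an alarm far from the record start, end - start == 20 * 125 == 2500
```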
I generated summary statistics largely inspired by Baumgartner,
Rodel, and Knoll (2012), who carried out a similar study with this dataset. The
following are the waveform features extracted. In each case n = 2,500, as there
are 2,500 values in each signal scope. In some cases the signal was divided
into cardiac intervals, which are individual heart beats.
1. Mean ECG:
$\frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ = ECG values
2. ECG Dispersion:
$\frac{\sum_{i=1}^{n} |x_i - \text{mean ECG}|}{\text{mean ECG}}$
3. Mean ABP:
$\frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ = ABP values
4. ABP Dispersion:
$\frac{\sum_{i=1}^{n} |x_i - \text{mean ABP}|}{\text{mean ABP}}$
5. Minimum ABP Height:
$\min(x_{sys} - x_{dias})$, where $x_{sys}$ = maximum ABP value in a cardiac
interval and $x_{dias}$ = minimum ABP value in a cardiac interval
6. Mean ABP Height:
$\text{mean}(x_{sys} - x_{dias})$
7. Maximum ABP Height:
$\max(x_{sys} - x_{dias})$
8. Minimum ABP Area (under the curve), determined using the
trapezoidal rule:
$\min\left(\int_{t(x_{sys})}^{t(x_{dias})} f(x)\,dx \approx \frac{h}{2}\sum_{i=1}^{m}\left(f(x_{i+1}) + f(x_i)\right)\right)$,
where $m$ = the number of values in the ABP cardiac interval and
$h = \frac{t(x_{dias}) - t(x_{sys})}{m}$
9. Mean ABP Area:
$\text{mean}\left(\int_{t(x_{sys})}^{t(x_{dias})} f(x)\,dx \approx \frac{h}{2}\sum_{i=1}^{m}\left(f(x_{i+1}) + f(x_i)\right)\right)$,
with $m$ and $h$ as in feature 8
10. Maximum ABP Area:
$\max\left(\int_{t(x_{sys})}^{t(x_{dias})} f(x)\,dx \approx \frac{h}{2}\sum_{i=1}^{m}\left(f(x_{i+1}) + f(x_i)\right)\right)$,
with $m$ and $h$ as in feature 8
11. Mean PAP:
$\frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ = PAP values
12. PAP Dispersion:
$\frac{\sum_{i=1}^{n} |x_i - \text{mean PAP}|}{\text{mean PAP}}$
13. Minimum PAP Height:
$\min(x_{sys} - x_{dias})$, where $x_{sys}$ = maximum PAP value in a cardiac
interval and $x_{dias}$ = minimum PAP value in a cardiac interval
14. Mean PAP Height:
$\text{mean}(x_{sys} - x_{dias})$
15. Maximum PAP Height:
$\max(x_{sys} - x_{dias})$
16. Minimum PAP Area (under the curve):
$\min\left(\int_{t(x_{sys})}^{t(x_{dias})} f(x)\,dx \approx \frac{h}{2}\sum_{i=1}^{m}\left(f(x_{i+1}) + f(x_i)\right)\right)$,
where $m$ = the number of values in the PAP cardiac interval and
$h = \frac{t(x_{dias}) - t(x_{sys})}{m}$
17. Mean PAP Area:
$\text{mean}\left(\int_{t(x_{sys})}^{t(x_{dias})} f(x)\,dx \approx \frac{h}{2}\sum_{i=1}^{m}\left(f(x_{i+1}) + f(x_i)\right)\right)$,
with $m$ and $h$ as in feature 16
18. Maximum PAP Area:
$\max\left(\int_{t(x_{sys})}^{t(x_{dias})} f(x)\,dx \approx \frac{h}{2}\sum_{i=1}^{m}\left(f(x_{i+1}) + f(x_i)\right)\right)$,
with $m$ and $h$ as in feature 16
19. Minimum Width ("The time difference between diastolic and systolic
values represents the width of the interval." (Baumgartner, Rodel, &
Knoll, 2012)):
$\min(t(x_{dias}) - t(x_{sys}))$
20. Mean Width:
$\text{mean}(t(x_{dias}) - t(x_{sys}))$
21. Maximum Width:
$\max(t(x_{dias}) - t(x_{sys}))$
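A sketch of how some of these features might be computed for one alarm scope. The arrays and the fixed one-second cardiac intervals are hypothetical; a real implementation would detect beat boundaries from the signal itself.

```python
import numpy as np

rng = np.random.default_rng(0)
ecg = rng.normal(size=2500)  # hypothetical 20 s ECG scope at 125 Hz
abp = 80 + 20 * np.sin(np.linspace(0, 50 * np.pi, 2500))  # hypothetical ABP scope

# Features 1-2: mean and dispersion of the ECG signal
ecg_mean = ecg.mean()
ecg_dispersion = np.abs(ecg - ecg_mean).sum() / ecg_mean

# Per-interval features assume the scope is split into cardiac intervals;
# here we pretend each interval is exactly 125 samples (one second).
intervals = abp.reshape(-1, 125)
heights = intervals.max(axis=1) - intervals.min(axis=1)  # systolic minus diastolic

# Area under each interval via the trapezoidal rule, h = 1/125 s per sample
h = 1 / 125
areas = (h / 2) * (intervals[:, 1:] + intervals[:, :-1]).sum(axis=1)

abp_heights = heights.min(), heights.mean(), heights.max()  # features 5-7
abp_areas = areas.min(), areas.mean(), areas.max()          # features 8-10
```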
The WFDB Software Package was used to extract the required waveforms and
numeric data. A Python program was then used to extract the above
summary statistics from the waveforms and create a CSV file, which also
included the numeric data. Along with the above 21 features, the following
features were also included in the final CSV file:
22. Alarm ID (name of record + order of alarm; e.g. the first alarm in
record a40017 was labelled 400171)
23. Alarm Type (one of the five "red alarm" arrhythmia types)
24. Age
25. Sex
26. CVP
27. SpO2
28. Whether the alarm was True or False
In total there were 28 columns of data and 4727 alarms (rows).
Data cleaning
After extracting this data, much of it was still unclean: many alarms
had missing feature values. This was due to a number of factors:
- Missing waveforms: not all records contained a PAP waveform,
so the summary statistics derived from the PAP could not be
extracted for the alarms of these records.
- NULL values in the database: many of the waveforms
contained "-" values instead of numeric values in the time series,
and these were entered into the dataset as "nan".
As a result the composition of the dataset was as follows:
Int64Index: 4727 entries, 400171 to 424312
Data columns (total 27 columns):
AlarmType 4726 non-null object
Age 4727 non-null object
Sex 4727 non-null object
CVP 3417 non-null float64
SpO2 4641 non-null float64
ECGMean 4727 non-null float64
ECGDispersion 4707 non-null float64
ABPMean 3673 non-null float64
ABPDispersion 3670 non-null float64
ABPMinHeight 3217 non-null float64
ABPMeanHeight 3214 non-null float64
ABPMaxHeight 3217 non-null float64
ABPMinArea 3214 non-null object
ABPMeanArea 3206 non-null float64
ABPMaxArea 3214 non-null object
PAPMean 1503 non-null float64
PAPDispersion 1502 non-null float64
PAPMinHeight 1304 non-null float64
PAPMeanHeight 1302 non-null float64
PAPMaxHeight 1304 non-null float64
PAPMinArea 1304 non-null object
PAPMeanArea 1302 non-null float64
PAPMaxArea 1304 non-null object
MinWidth 3219 non-null float64
MeanWidth 3219 non-null float64
MaxWidth 3219 non-null float64
TrueFalse 4727 non-null float64
With 4727 entries, only 5 columns had no missing values. Cleaning
this set may have given inaccurate results, so I decided to reduce the
size of the data. I took the following steps:
1. I removed the only observation that didn't have a classification
for the type of arrhythmia (AlarmType).
2. I removed 7 alarms where the age of the patient was marked
"??" and which didn't have a PAP signal. These seven alarms
corresponded to one patient.
3. There was one other patient whose age was marked "??". This
patient had all signals present, and the data was too valuable to
remove, so I applied a regression model to predict age.
"Regression is used to study the relationship between
quantitative variables" (Cronin, 2014). We assume there is a
linear relationship between a set of predictor variables (X) and
a response variable (Y). "An equation expresses the response as
a linear function of the predictor variables. This equation is
estimated from the data. This model is:
$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip} + e_i, \quad e_i \sim NID(0, \sigma^2)$"
(Cronin, 2014).
The $\beta$ values are the least squares estimates, which minimise
the error in the model. I used the R statistical software to
compute these least squares estimates.
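As a sketch of the least squares computation (here with NumPy rather than R, on hypothetical values for the two predictors used below, ABPMeanHeight and PAPMean):

```python
import numpy as np

# Hypothetical predictors for five alarms: [ABPMeanHeight, PAPMean]
X = np.array([[50.0, 30.0],
              [55.0, 28.0],
              [48.0, 35.0],
              [60.0, 25.0],
              [52.0, 31.0]])
y = np.array([66.0, 70.0, 62.0, 75.0, 67.0])  # ages

# Prepend an intercept column, then solve min ||Xb*beta - y||^2 for beta
Xb = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)

predicted_age = Xb @ beta  # beta[0] = intercept, beta[1:] = slopes
```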
Figure 9. Scatter plot of Age and Mean ABP Height (non-null values)
There is a weak positive relationship between age and mean ABP height.
Figure 10. Scatter plot of Age and Mean PAP (non-null values)
There is a weak negative relationship between age and mean PAP.
ABP Mean Height and Mean PAP had the largest Pearson's correlations
with age, at 0.2268647 and −0.2223904 respectively. Although individually
their correlations aren't large, I hoped that together they could predict the age
of the patient. So I fit a regression model using these variables and got the
following output:
lm(formula = age ~ ABPMeanHeight + PAPMean, data = alarm.df)
Residuals:
Min 1Q Median 3Q Max
-31.662 -7.582 0.421 6.783 33.861
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 70.02368 1.11285 62.923 < 2e-16 ***
ABPMeanHeight 0.06650 0.01554 4.280 2.01e-05 ***
PAPMean -0.23716 0.02516 -9.427 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 10.28 on 1210 degrees of freedom
(3469 observations deleted due to missingness)
Multiple R-squared: 0.07796, Adjusted R-squared: 0.07644
F-statistic: 51.16 on 2 and 1210 DF, p-value: < 2.2e-16
The R² value is only 7.796%. All the parameters of this model are
statistically significant (very low p-values). Therefore, I applied the linear
model to the 37 cases and averaged the 37 responses to estimate the age of the
person with whom these alarms are associated.
Once I finished this, I noticed that a vast amount of PAP-related
summary statistics were missing; each had 1503 or fewer values associated
with it. As I wanted this signal to be part of my algorithms and models, I
decided to remove all alarms that didn't have a Mean PAP value. This greatly
reduced the dataset, down to 1503 alarms.
One value was missing for ECG Dispersion. I removed this case.
There weren't many missing values after this reduction. I didn't want to
remove any more alarms, so I opted to impute the remaining missing values
without discarding any. Most columns still had a small number of values
missing, so I applied the following linear regression models to these data to
predict the values (summary output from these models is available in the
Appendix).
Minimum ABP Height ~ Sex + PAP Dispersion
Mean ABP Height ~ Sex + PAP Dispersion
Max ABP Height ~ Sex + PAP Dispersion
Minimum ABP Area ~ Min PAP Area + Min ABP Height + Min PAP Height
Mean ABP Area ~ Mean ABP + Max ABP Area + Mean PAP Area
Maximum ABP Area ~ Mean ABP Height + PAP Dispersion
Minimum Width ~ Mean ABP Height + Max ABP Area + ECG Dispersion
Mean Width ~ Mean ABP Height + Max ABP Area + ECG Dispersion
Maximum Width ~ Mean ABP Height + Max ABP Area
CVP ~ Age + Sex
Minimum PAP Area ~ Mean PAP + Min ABP Height
Mean PAP Area ~ Mean PAP + Max ABP Area + Age
Maximum PAP Area ~ Mean PAP + Max Width
Minimum PAP Height ~ Min PAP Area + Min ABP Height
Mean PAP Height ~ Mean PAP + Mean PAP Area
Maximum PAP Height ~ Mean PAP + Max PAP Area
Mean ABP ~ Max ABP Area + Mean ABP Height
ABP Dispersion ~ Mean ABP Height
SpO2 ~ CVP + Max PAP Height + Min ABP Height
PAP Dispersion ~ Max PAP Height + Mean PAP Height + Max ABP Height
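A sketch of this imputation step (a hypothetical helper on toy data; the dissertation's models were fitted in R, and the example mirrors the "ABP Dispersion ~ Mean ABP Height" formula above):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def impute_column(df, target, predictors):
    """Fill missing values of `target` with predictions from a linear model
    fitted on the rows where both target and predictors are present."""
    complete = df.dropna(subset=predictors + [target]).index
    missing = df.index[df[target].isna() & df[predictors].notna().all(axis=1)]
    if len(missing) == 0:
        return df
    model = LinearRegression().fit(df.loc[complete, predictors],
                                   df.loc[complete, target])
    df.loc[missing, target] = model.predict(df.loc[missing, predictors])
    return df

# Toy data: one ABPDispersion value is missing and gets imputed
df = pd.DataFrame({"ABPMeanHeight": [50.0, 55.0, 48.0, 60.0],
                   "ABPDispersion": [460.0, 480.0, 455.0, np.nan]})
df = impute_column(df, "ABPDispersion", ["ABPMeanHeight"])
```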
Finally, I arrived at a final table with 29 columns and 1502 rows.
Table 2. Summary of the dataset
Max Mean Median Min SD
ABPDispersion 22403.77 467.29 459.24 -61726.69 1730.23
ABPMaxHeight 272.40 68.75 67.30 -12.77 24.19
ABPMaxarea 513.00 112.28 107.58 3.61 51.16
ABPMean 180.00 71.80 70.86 -0.37 18.25
ABPMeanArea 282.37 56.63 55.41 -0.08 16.96
ABPMeanHeight 131.94 50.95 50.09 -6.73 18.11
ABPMinHeight 110.40 27.64 23.98 0.00 20.54
ABPMinarea 280.80 27.30 27.02 -26.41 16.61
CVP 346.00 28.32 12.70 -5.00 59.27
Chan 1.00 0.57 1.00 0.00 0.50
ECGDispersion 8910718.89 -30713.11 -15454.28 -14607261.09 590278.99
ECGMean 5.98 0.01 -0.00 -0.80 0.17
Gender 1.00 0.69 1.00 0.00 0.46
MaxWidth 7.32 0.95 0.90 -0.37 0.62
MeanWidth 1.57 -0.02 0.02 -0.56 0.27
MinWidth 1.56 -0.49 -0.49 -4.14 0.35
PAPDispersion 4412.38 511.59 484.39 -2374.95 270.54
PAPMaxHeight 138.40 34.10 32.80 0.00 15.40
PAPMaxarea 338.86 49.99 44.74 -10.14 30.63
PAPMean 179.88 30.53 29.40 -26.30 12.06
PAPMeanArea 152.08 24.09 22.77 -15.19 10.37
PAPMeanHeight 73.52 22.47 21.65 0.00 9.93
PAPMinHeight 52.00 11.27 10.00 0.00 8.22
PAPMinarea 70.51 10.55 10.98 -46.35 7.44
SpO2 100.00 57.52 94.00 0.00 47.46
TrueFalse 3.00 1.86 1.00 1.00 0.99
age 90.00 66.37 66.00 37.00 10.70
Research Instrument
With the data now ready, I set about applying a method similar to that
of Baumgartner et al. (2012).
Discretisation
Baumgartner et al. (2012) started by discretising the data. "Many real-
world classification tasks exist that involve continuous features where such
algorithms could not be applied unless the continuous features are first
discretized [sic]" (Dougherty, Kohavi, & Sahami, 1995). Baumgartner et al.
(2012) used 6-bin equal frequency discretisation. Equal frequency
discretisation involves splitting the data into equal-sized bins, in our case six, as I
wanted to replicate Baumgartner et al.'s methods. Therefore, for each feature,
there are six possible values.
I have 1502 values, which means each bin will ideally contain 250
or 251 values. However, as there are repeated values, one of the
conditions I set was that identical values could not be split into separate bins.
As a result, the frequencies are sometimes not exactly equal.
Figure 11. Sample discretisation
The SpO2 value was split into two bins as there were far too many 0
values for six bins.
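A sketch of 6-bin equal frequency discretisation with pandas, on a hypothetical continuous feature; `qcut`'s `duplicates="drop"` mirrors the rule that tied values are not split across bins, which is why bin sizes are only approximately equal:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
feature = pd.Series(rng.normal(size=1502))  # one hypothetical continuous feature

# Equal frequency binning: each of the 6 bins gets ~250 values;
# ties are kept together, so bin sizes are only approximately equal.
binned = pd.qcut(feature, q=6, labels=False, duplicates="drop")
```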
For simplicity and clarity, I changed the outcome variable coding from
1 = True, 3 = False to 1 = True, 0 = False.
I changed arrhythmia type from a single categorical variable to five
dummy variables: Asystole, Tachy, Brady, VFib, and V-Tach.
Normalisation
I wanted two datasets for this experiment so that I could compare
results from both. With the discrete set ready, I also created a second,
continuous set in which the variables were all standardised. I normalised each
feature using the formula

$\frac{x - \bar{x}}{s}$,

where $\bar{x}$ = mean of the feature and $s$ = standard deviation of the feature.
The features needed to be scaled for clustering so that each feature
would have equal weight; otherwise, features with larger variance would have
more influence on the distance calculation.
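The standardisation step can be sketched as follows (a minimal helper, not the exact code used):

```python
import numpy as np

def standardise(X):
    """Z-score each column: subtract its mean and divide by its standard
    deviation, so every feature carries equal weight in distance calculations."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

scaled = standardise([[1.0, 200.0],
                      [2.0, 400.0],
                      [3.0, 600.0]])
# each column of `scaled` now has mean 0 and standard deviation 1
```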
Cross-validation
In order to attain fair results, and to avoid overfitting of the model to
the train data, I applied a 10-fold cross-validation.
10-fold cross-validation involves partitioning the data into 10 folds of
approximately equal size (about 150 observations in each fold). Nine folds are
used to train the model and the remaining fold is used for testing. The process
is repeated so that every fold is used once for testing and nine times for
training, and the 10 results obtained are averaged to produce a single estimate.
I used scikit-learn, a Python machine learning module. scikit-learn has
a function cross_validation.KFold() which performed the 10-fold cross-
validation. It assigned data to each fold randomly.
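The 10-fold loop can be sketched as follows; note that in current scikit-learn releases KFold lives in `sklearn.model_selection` rather than the older `cross_validation` module, and the "classifier" here is a stand-in majority rule on toy data:

```python
import numpy as np
from sklearn.model_selection import KFold  # cross_validation.KFold in 2015-era releases

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))     # hypothetical scaled alarm features
y = rng.integers(0, 2, size=150)  # 1 = true alarm, 0 = false alarm

accuracies = []
kf = KFold(n_splits=10, shuffle=True, random_state=0)  # random fold assignment
for train_idx, test_idx in kf.split(X):
    # Train on 9 folds (here: a trivial majority-class rule), test on the 10th
    majority = int(round(float(y[train_idx].mean())))
    accuracies.append(float(np.mean(y[test_idx] == majority)))

mean_accuracy = sum(accuracies) / len(accuracies)  # single averaged estimate
```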
Feature selection
I created subsets of the data with a reduced number of features. The
criterion for these subsets was that each feature must have a Pearson's
correlation coefficient with an absolute value greater than 0.1 with the
outcome variable TrueFalse. This left 13 features.
A further subset was the same as above with the arrhythmia type also
removed. This left 9 features.
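This correlation filter can be sketched with pandas (toy data; featA is deliberately constructed to correlate with the outcome):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "TrueFalse": rng.integers(0, 2, size=500),
    "featA": rng.normal(size=500),
    "featB": rng.normal(size=500),
})
df["featA"] += df["TrueFalse"]  # give featA a real association with the outcome

# Keep features whose |Pearson r| with TrueFalse exceeds 0.1
corr = df.corr()["TrueFalse"].drop("TrueFalse")
selected = corr[corr.abs() > 0.1].index.tolist()
```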
Classification model
I used a Support Vector Machine (SVM) as the classification model,
because it was the best-performing model that Baumgartner et al. (2012)
applied to the entire dataset. The results of the SVM were compared to the
results of the clustering.
"Support Vector Machines are based on the concept of decision
planes that define decision boundaries. A decision plane is one that
separates between a set of objects having different class memberships.
A schematic example is shown in the illustration below. In this
example, the objects belong either to class GREEN or RED. The
separating line defines a boundary on the right side of which all objects
are GREEN and to the left of which all objects are RED. Any new
object (white circle) falling to the right is labeled, i.e., classified, as
GREEN (or classified as RED should it fall to the left of the separating
line)." (StatSoft Inc., 2015)
Figure 12. Linear Support Vector Machine classifier
The above shows a support vector machine in 2D space, but
support vector machines are complex algorithms that can operate in
multidimensional space.
The lines, or hyperplane classifiers, are not limited to linearity
and can be transformed using a kernel function. In my case, I used a
Gaussian kernel, also known as the Radial Basis Function (RBF):

$K(x, x') = e^{-\gamma \|x - x'\|^2}$

where $\gamma$ is an adjustable parameter and $x - x'$ is the difference
between the two feature vectors.
The support vector machine was implemented using scikit-
learnβs svm.SVC() function, a support vector classifier.
The following parameters were set:
- C = 1: C is the penalty parameter of the error term. This
determines how strongly the model learns from errors in the
training set; if it is too large, the model will overfit the training
set.
- γ = 0.01: coefficient of the kernel
Support Vector Machines return an array of probabilities; in my case
these refer to the probability of the alarm being a true positive and the
probability of it being a false positive. These probabilities are used to
determine the classification. By default, the threshold is 0.5: if the probability
of a true alarm is greater than or equal to 0.5, the alarm is considered true, and
if it is less than 0.5, the alarm is considered false. However, I altered the
threshold and obtained different results for the following thresholds:
- 0.5
- 0.25
- 0.1
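The thresholding can be sketched with scikit-learn's SVC on toy data (column order in `predict_proba` follows `clf.classes_`):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))            # hypothetical scaled features
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # 1 = true alarm, 0 = false alarm

# RBF kernel with the parameters from the text: C = 1, gamma = 0.01
clf = SVC(kernel="rbf", C=1.0, gamma=0.01, probability=True).fit(X, y)

p_true = clf.predict_proba(X)[:, list(clf.classes_).index(1)]

# Lowering the threshold makes the model more reluctant to label an alarm false
for threshold in (0.5, 0.25, 0.1):
    classified_true = p_true >= threshold
```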
K-means clustering
I used the scaled continuous dataset for both of my clustering techniques,
and again applied 10-fold cross-validation, as with the SVM. The first clustering
algorithm I implemented was k-means clustering. The k-means algorithm is
as follows:
1. K random points are chosen in the sample space. In our case, k
random hypothetical alarms are created, with random features
within the feature space. These random points are our initial
centres.
2. For every real alarm, the squared Euclidean distance between
the alarm and each centre is calculated:

$\sum_{f=1}^{n} (x_{if} - x_{jf})^2$,

where $n$ = number of features and $i, j$ = the two instances being
compared.
3. Each alarm is associated with the centre to which it has the
minimum distance. Every alarm is assigned to a cluster, and
each cluster has a centre.
4. The centre of each cluster is then recomputed as the
hypothetical point with the minimal within-cluster variation
(i.e. the average squared Euclidean distance between the centre
and its alarms).
5. Steps 2, 3 and 4 are repeated with the new cluster centres.
6. This is repeated until the cluster centres do not change, i.e. the
within-cluster variation cannot improve.
I implemented this algorithm using Python's sklearn.cluster.KMeans,
and set k = 27 as a starting and reference point, based on Kanti Mardia's rule
of thumb:

$k \approx \sqrt{n/2}$, with $n$ = number of observations (Mardia, Kent, &
Bibby, 1979).
I created the clusters using the train set and observed the number of
true alarms in each cluster and the number of false alarms in each cluster. If a
cluster had more true alarms than false, I labelled it 1 for true. If a cluster had
more false alarms than true, I labelled it 0 for false.
I then assigned the test subjects to clusters based on the minimal
Euclidean distance to each cluster centre. Each test subject was assigned 0 or 1
based on the label of the cluster to which it was assigned.
This gave an accuracy result, and I quickly established a safer
approach to ensure no true positives were misidentified as false positives.
I enforced a stricter criterion for alarms being identified as false,
because we need to be certain an alarm is false before labelling it as such; as
stated previously, this is a very important element of my project. The new
criterion was that a cluster could only be labelled false (0) if all alarms within
the cluster were false alarms. Even a small percentage of true alarms in a
cluster represents a slight but real risk of true-positive misidentification. I
reran the algorithm and achieved very different results.
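The clustering and the strict labelling rule can be sketched as follows (toy data; a cluster is labelled false only if it contains no true alarms at all):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
X_train = rng.normal(size=(300, 4))     # hypothetical scaled alarm features
y_train = rng.integers(0, 2, size=300)  # 1 = true alarm, 0 = false alarm
X_test = rng.normal(size=(50, 4))

km = KMeans(n_clusters=27, n_init=10, random_state=0).fit(X_train)

# Strict criterion: label a cluster 0 (false) only if every train alarm
# assigned to it is a false alarm; otherwise label it 1 (true).
cluster_label = np.ones(27, dtype=int)
for c in range(27):
    members = y_train[km.labels_ == c]
    if members.size > 0 and members.sum() == 0:
        cluster_label[c] = 0

# Test alarms inherit the label of their nearest cluster centre
y_pred = cluster_label[km.predict(X_test)]
```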
I then ran the algorithm in a loop with different values for k to find the
optimal number of clusters.
K-medoids clustering
Finally, I applied k-medoids clustering. The following is the k-medoids
algorithm, also known as Partitioning Around Medoids (PAM) (Kaufman &
Rousseeuw, 2005):
1. K random points are chosen in the sample space. In our case, k
random real alarms are selected. These alarms are our initial
centres, known as medoids.
2. For every other alarm, the Euclidean squared distance (also
known as the dissimilarity) between the alarm and each medoid
is calculated.
3. Each alarm is associated with the medoid with which it has the
minimum distance. Every alarm is assigned to a cluster, with
each cluster having a medoid.
4. The medoid of each cluster is then recomputed to be the alarm
within the cluster with the minimal within-cluster variation,
(i.e. the average Euclidean squared distance between medoid
and alarms).
5. Steps 2, 3 and 4 are repeated with the new medoids.
6. This is repeated until the medoids do not change. i.e. the
clusters cannot get a better within-cluster variation.
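The PAM steps above can be sketched in plain NumPy (the dissertation used R's cluster package; this minimal version also substitutes a deterministic farthest-point start for the random initial medoids):

```python
import numpy as np

def k_medoids(X, k, n_iter=100):
    """Minimal PAM-style k-medoids: medoids are always real data points."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances (dissimilarities)
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Deterministic farthest-point initialisation (stand-in for random starts)
    medoids = [0]
    while len(medoids) < k:
        medoids.append(int(d[:, medoids].min(axis=1).argmax()))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = d[:, medoids].argmin(axis=1)  # steps 2-3: nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):                     # step 4: best member per cluster
            members = np.where(labels == c)[0]
            if members.size:
                new_medoids[c] = members[d[np.ix_(members, members)].sum(axis=1).argmin()]
        if np.array_equal(new_medoids, medoids):  # step 6: converged
            break
        medoids = new_medoids                  # step 5: repeat with new medoids
    return medoids, d[:, medoids].argmin(axis=1)

# Two well-separated hypothetical "alarm" blobs
X = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
medoids, labels = k_medoids(X, k=2)
```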
This algorithm is more robust than k-means clustering (Kaufman &
Rousseeuw, 2005). I implemented it using the cluster package in R. I applied
the same procedure for assigning test subjects to clusters as with k-means,
where a cluster could only be labelled false if no true alarms from the train set
were assigned to it. I ran the algorithm over a loop to find the optimal value of
k (the number of clusters).
I then produced a Receiver Operating Characteristic (ROC) curve for
the best model using the pROC package in R (Robin, et al., 2011). The ROC
curve plots the true positive rate (sensitivity) against the false positive rate
(1 - specificity).
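As an illustration of the quantities the ROC curve plots, the curve points and the AUC can be computed as follows (a hedged Python sketch using scikit-learn rather than pROC; the scores and labels are made up for the example):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical predicted scores and true labels (1 = true alarm).
y_true = np.array([0, 0, 1, 1, 0, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])

# fpr = 1 - specificity, tpr = sensitivity; one point per threshold.
fpr, tpr, thresholds = roc_curve(y_true, scores)
auc = roc_auc_score(y_true, scores)  # area under the ROC curve
```

Sweeping the decision threshold traces out the curve; the AUC summarises it as the probability that a randomly chosen true alarm scores higher than a randomly chosen false one.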
CHAPTER 4: RESULTS AND DISCUSSION
Results
Support Vector Machines
The following are the results of the support vector machine
classification algorithm. I performed nine tests in total: six with scaled,
continuous data and three with discrete data. None of the models achieved
0% true positive misidentification.
Table 3. Results of Support Vector Machine model using scaled and continuous data

Scaled and continuous data
# features | p    | False Positives Identified | True Positives Misidentified | Overall Accuracy
13         | 0.5  | 59.268%                    | 27.823%                      | 70.758%
13         | 0.25 | 39.254%                    | 12.291%                      | 69.823%
13         | 0.1  | 15.879%                    | 2.265%                       | 64.361%
9          | 0.5  | 53.368%                    | 26.511%                      | 69.23%
9          | 0.25 | 25.957%                    | 7.94%                        | 66.956%
9          | 0.1  | 5.872%                     | 0.417%                       | 59.359%
Table 4. Results of Support Vector Machine model using discrete data and all features

Discrete data
# features | p    | False Positives Identified | True Positives Misidentified | Overall Accuracy
26         | 0.5  | 67.136%                    | 29.835%                      | 70.892%
26         | 0.25 | 42.547%                    | 11.916%                      | 70.692%
26         | 0.1  | 24.732%                    | 2.738%                       | 67.091%
K-means clustering
After running the loop I found that k = 45 was optimal for this
algorithm.
Table 5. Results of k-means clustering model

Scaled and continuous data
# features | k  | Cluster Threshold* | False Positives Identified | True Positives Misidentified | Overall Accuracy
30         | 27 | #F > #T            | 44.866%                    | 21.281%                      | 65.165%
30         | 27 | T=0                | 1.331%                     | 0%                           | 57.455%
30         | 45 | T=0                | 6.327%                     | 0%                           | 59.607%

* Cluster Threshold = the criterion for a cluster being labelled false:
#F > #T = the cluster contains more false alarms than true alarms.
T=0 = the cluster contains no true alarms.
k-medoids clustering
As I learned from k-means clustering, I only allowed clusters to be
labelled false if there were no true alarms assigned to the cluster. I found
the optimal value of k to be 49.
Table 6. Results of k-medoids clustering model

Scaled and continuous data
# features | k  | Cluster Threshold | False Positives Identified | True Positives Misidentified | Overall Accuracy
30         | 49 | T=0               | 9.169%                     | 0%                           | 48.334%
Given the aim of 0% true positive misidentification, the k-medoids
clustering model was the best of all the models I tested.
Figure 13. Visual representation of results
Figure 14. ROC curve for the k-medoids model
Figure 14 shows the Receiver Operating Characteristic (ROC) curve
for the k-medoids model. The Area Under the Curve (AUC) is 0.4543. The
optimal AUC is 1, which would represent perfect discrimination between
false positives and true positives; an AUC below 0.5 indicates that the
model ranks alarms worse than random guessing.
Discussion
• k-means clustering can significantly reduce the rate of false
positive arrhythmia alarms without causing any false negative
alarms.
When the correct criteria are set for labelling an alarm as true,
k-means clustering can reduce the rate of false positive
arrhythmia alarms by 6.327% without causing any false
negative alarms.
• k-medoids clustering can significantly reduce the rate of
false positive arrhythmia alarms without causing any false
negative alarms.
When the correct criteria are set for labelling an alarm as true,
k-medoids clustering can reduce the rate of false positive
arrhythmia alarms by 9.169% without causing any false
negative alarms.
• k-medoids clustering has a higher false positive reduction
rate than k-means clustering.
This was the expected result, as k-medoids clustering is a more
robust algorithm than k-means clustering. K-medoids' best
result was a 9.169% reduction with 0 true alarms suppressed;
k-means' best result was 6.327% with 0 true alarms suppressed.
K-medoids therefore has a higher false positive reduction rate
than k-means clustering.
The results for Support Vector Machines do not agree with those of
Baumgartner, et al. (2012) because I used fewer predictor variables and a
different feature selection. However, the Support Vector Machine in my
project served as a control for the clustering algorithms, so obtaining the
same results as theirs does not ultimately affect the conclusion of this
project.
These results show that clustering can be used to suppress false alarms
in a safe way. They also show that k-medoids clustering is more effective at
this than k-means clustering.
I found that 49 was the optimal number of clusters in this case. This
number is not definitive: the variability of alarms grows with larger
datasets, so a greater number of clusters may yield better results on larger
datasets.
Given the tests I carried out, we can be 95% confident that k-medoids
clustering reduces the rate of false alarms by between 6.947% and 11.391%.
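This interval is consistent with a normal-approximation 95% confidence interval for a proportion, taking the suppression rate 9.169% over the 648 false alarms in the test set (the 648 controls reported in the ROC output); a minimal sketch under that assumption:

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# 9.169% of the 648 test-set false alarms were suppressed.
low, high = proportion_ci(0.09169, 648)
# low and high round to 0.06947 and 0.11391, i.e. 6.947% to 11.391%.
```

The width of the interval shrinks with the square root of the number of false alarms, so larger test sets would give a tighter estimate of the reduction rate.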
CHAPTER 5: SUMMARY, CONCLUSIONS AND
RECOMMENDATIONS
The findings of this study are that clustering can reduce the false alarm
rate of ICU bedside monitors without suppressing any true alarms.
Classification algorithms achieve higher overall classification
accuracy of alarms. A previous study that tested five classification
algorithms showed the Support Vector Machine model to be the most
effective of them, but it still suppresses true alarms, which is impractical in
an ICU even when the rate of true alarm suppression is low.
The two clustering algorithms, k-means and k-medoids, are not as
accurate but can be adapted to ensure the rate of true alarm suppression is
0%. k-medoids clustering can reduce false alarm rates in the ICU by
between 6.947% and 11.391%.
It is important to note that although my algorithm achieved a reduction
in false positives without generating false negatives, there may exist alarm
signatures not present in this dataset that would generate false negatives
with this algorithm. As a result, the algorithm should be tested on further,
more comprehensive datasets before being implemented in hospitals.
References
AAMI. (2002). Cardiac monitors, heart rate meters, and alarms. Arlington:
AAMI.
Aboukhalil, A., Nielsen, L., Saeed, M., Mark, R. G., & Clifford, G. D. (2008).
Reducing False Alarm Rates for Critical Arrhythmias Using the
Arterial Blood Pressure Waveform. J Biomed Inform, 442-451.
Al-Mamory, S., & Zhang, H. (2009). Intrusion detection alarms reduction
using root cause analysis and clustering. Comp. Commun., 419-430.
Al-Mamory, S., & Zhang, H. (2010). New data mining technique to enhance
IDS alarms quality. J. Comp. Virol., 43-55.
American Heart Association. (2015, August 20). Ventricular Fibrillation.
Retrieved from American Heart Association - Building healthier lives,
free of cardiovascular diseases and stroke:
http://www.heart.org/HEARTORG/Conditions/Arrhythmia/AboutArrh
ythmia/Ventricular-Fibrillation_UCM_324063_Article.jsp
Baumgartner, B., Rodel, K., & Knoll, A. (2012). A Data Mining Approach to
Reduce the False Alarm Rate of Patient Monitors. IEEE EMBS, 5935-
5938.
Benferhat, S., Boudjelida, A., Tabia, K., & Drias, H. (2013). An intrusion
detection and alert correlation approach based on revising probabilistic
classifiers using expert knowledge. Int. J. Appl. Intell., 520-540.
Caggiano, R. M. (2015, August 19). Asystole: Background, Pathophysiology,
Etiology. Retrieved from Diseases & Conditions - Medscape
Reference: http://emedicine.medscape.com/article/757257-overview
Chambrin, M.-C. (2001). Alarms in the intensive care unit: how can the
number of false alarms be reduced? Crit Care, 184-188.
Chitrakar, R., & Chuanhe, H. (2012). Anomaly Detection using Support Vector
Machine Classification with k-Medoids Clustering. Internet (AH-ICI),
Third Asian Himalayas International Conference on (pp. 1-5).
Kathmandu: IEEE.
Chung, D. C. (2015, August 19). ECG - A Pictorial Primer. Retrieved from
Medicine-On-Line.com: http://www.medicine-on-
line.com/html/ecg/e0001en.htm
Clifford, G., Aboukhalil, A., Sun, J., Zong, W., Janz, B., Moody, G., & Mark,
R. (2006). Using the Blood Pressure Waveform to Reduce Critical
False ECG Alarms. Computers in Cardiology, 829-832.
Compton, S. J. (2015, August 19). Ventricular Tachycardia: Practice
Essentials, Background, Pathophysiology. Retrieved from Diseases &
Conditions - Medscape Reference:
http://emedicine.medscape.com/article/159075-overview#a3
Cronin, M. (2014). ST6030 Foundations of Statistical Data Analysis. Cork:
Department of Statistics, UCC.
Dey, C. (2009). Reducing IDS False Positives Using Incremental Stream
Clustering (ISC) Algorithm. M.Sc. Thesis, Royal Institute of
Technology, Sweden.
Dougherty, J., Kohavi, R., & Sahami, M. (1995). Supervised and Unsupervised
Discretization of Continuous Features. The XII International
Conference on Machine Learning (pp. 194-202). Tahoe City: Morgan
Kaufmann.
Goldberger, A. L., Amaral, L. A., Glass, L., Hausdorff, J. M., Ivanov, P. C.,
Mark, R. G., . . . Stanley, H. E. (2000). PhysioBank, PhysioToolkit,
and PhysioNet: Components of a New Research Resource for Complex
Physiologic Signals. Circulation, 215-220.
Hu, X., Sapo, M., Nenov, V., Barry, T., Kim, S., Do, D. H., . . . Martin, N.
(2012). Predictive combinations of monitor alarms preceding in-
hospital code blue events. Journal of Biomedical Informatics, 913-921.
Hubballi, N., & Suryanarayanan, V. (2014). False alarm minimization
techniques in signature-based intrusion detection systems: A survey.
Computer Communications, 1-17.
Julisch, K. (2001). Mining alarm clusters to improve alarm handling
efficiency. ACSAC '01, 12-21.
Julisch, K. (2003a). Using Root Cause Analysis to Handle Intrusion Detection
Alarms. Ph.D. Thesis, IBM Zurich Research Laboratory.
Julisch, K. (2003b). Clustering intrusion detection alarms to support root cause
analysis. ACM TISSEC, 443-471.
Julisch, K., & Dacier, M. (2002). Mining intrusion detection alarms for
actionable knowledge. KDD '02 Proceedings of the eighth ACM
SIGKDD international conference on Knowledge discovery and data
mining, 366-375.
Kaufman, L., & Rousseeuw, P. J. (2005). Finding Groups in Data: An
Introduction to Cluster Analysis. Hoboken: John Wiley & Sons Inc.
Kim, E., Shin, M., & Ryu, K. (2004). False alarm classification model for
network-based intrusion detection system. IDEAL '04, 259-265.
Law, K., & Kwok, L. (2004). IDS false alarm filtering using knn classifier.
WISA '04, 114-121.
Lawless, S. T. (1994). Crying Wolf: False alarms in a pediatric intensive care
unit. Critical Care Medicine, 981-985.
Lincoln Laboratory, MIT. (2015, August 19). MIT Lincoln Laboratory:
DARPA Intrusion Detection Evaluation. Retrieved from MIT -
Massachusetts Institute of Technology:
http://www.ll.mit.edu/ideval/docs/
Manganaris, S., Christensen, M., Zerkle, D., & Hermiz, K. (2000). A data
mining analysis of RTID alarms. Comp. Netw., 571-577.
Mardia, K. V., Kent, J. T., & Bibby, J. M. (1979). Multivariate Analysis.
London: Academic Press.
Massachusetts Institute of Technology. (2015, July 29). PhysioNet. Retrieved
from PhysioNet: http://www.physionet.org
Medical Training and Simulation LLC. (2015, August 20). Ventricular
Fibrillation EKG Reference. Retrieved from ECG | EKG | Heart
Sounds | Murmurs | Lung Sounds | Hundreds of free lessons and drills |
Practical Clinical Skills: http://www.practicalclinicalskills.com/ekg-
reference-guide-details.aspx?lessonID=26
NIH. (2011, July 1). What Is an Arrhythmia? - NHLBI, NIH. Retrieved from
National Institutes of Health (NIH):
https://www.nhlbi.nih.gov/health/health-topics/topics/arr
Parikh, D., & Chen, T. (2008). Data fusion and cost minimization for intrusion
detection. IEEE Transactions on Information Forensics and Security,
381-390.
Perdisci, R., Giacinto, G., & Roli, F. (2006). Alarm clustering for intrusion
detection systems in computer networks. Eng. Appl. Artif. Intell., 429-
438.
Pietraszek, T. (2004). Using adaptive alert classification to reduce false
positives in intrusion detection. RAID '04, 102-124.
Pietraszek, T., & Tanner, A. (2005). Data mining and machine learning -
towards reducing false positives in intrusion detection. Inform. Sec.
Tech. Rep., 169-183.
Resuscitation Council (UK). (2015, August 19). Resuscitation Council (UK) -
Advanced Life Support - Bradycardia. Retrieved from Welcome to the
Resuscitation Council (UK) E-Learning Website:
https://lms.resus.org.uk/modules/m55-v2-
bradycardia/10346/m55/t05/content/m55_t05_005sr.htm?next
Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J.-C., &
Muller, M. (2011). pROC: an open-source package for R and S+ to
analyze and compare ROC curves. BMC Bioinformatics, 77.
Sadoddin, R., & Ghorbani, A. (2009). An incremental frequent structure
mining framework for real-time alert correlation. Comp. Sec., 153-173.
Saeed, M., Lieu, C., Raber, G., & Mark, R. (2002). MIMIC II: A Massive
Temporal ICU Patient Database to Support Research in Intelligent
Patient Monitoring. Computers in Cardiology, 641-644.
Siebig, S., Kuhls, S., Imhoff, M., Langgartner, J., Reng, M., SchΓΆlmerich,
J., . . . Wrede, C. (2010). Collection of annotated data in a clinical
validation study for alarm algorithms in intensive care--a methodologic
framework. J Crit Care, 128-35.
Soleimani, M., & Ghorbani, A. (2008). Critical episode mining in intrusion
detection alerts. Proceedings of the Communication Networks and
Services Research Conference, IEEE Computer Society, 157-164.
Sowan, A. K., Tarriela, A. F., Gomez, T. M., Reed, C. C., & Rapp, K. M.
(2015). Nurses' Perceptions and Practices Toward Clinical Alarms in a
Transplant Cardiac Intensive Care Unit: Exploring Key Issues Leading
to Alarm Fatigue. JMIR Human Factors, 28-37.
StatSoft Inc. (2015, August 21). Support Vector Machines (SVM). Retrieved
from Big Data Analytics, Enterprise Analytics, Data Mining Software,
Statistical Analysis, Predictive Analytics:
http://www.statsoft.com/Textbook/Support-Vector-Machines
Thomas, C., & Balakrishnan, N. (2008). Performance enhancement of
intrusion detection systems using advances in sensor fusion. Fusion
'08, 1671-1677.
Tjhai, G., Furnell, S., Papadaki, M., & Clarke, N. (2010). A preliminary two-
stage alarm correlation and filtering system using SOM neural network
and k-means algorithm. Computers and Security, 712-723.
Appendix
Regression Model summary output:
• Predicting Minimum ABP Height
lm(formula = ABPMinHeight ~ Gender + PAPDispersion, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-49.666 -16.939 -1.155 14.693 67.817
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 34.409193 1.467322 23.450 < 2e-16 ***
Gender -15.242222 1.309088 -11.643 < 2e-16 ***
PAPDispersion 0.007456 0.002257 3.303 0.000983 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.17 on 1246 degrees of freedom
(1 observation deleted due to missingness)
Multiple R-squared: 0.09968, Adjusted R-squared: 0.09823
F-statistic: 68.97 on 2 and 1246 DF, p-value: < 2.2e-16
• Predicting Mean ABP Height
lm(formula = ABPMeanHeight ~ Gender + PAPDispersion + age, data
= train.df)
Residuals:
Min 1Q Median 3Q Max
-75.935 -10.357 1.244 10.554 58.894
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 41.575676 3.347654 12.419 < 2e-16 ***
Gender -15.535606 1.087892 -14.280 < 2e-16 ***
PAPDispersion 0.018319 0.001874 9.775 < 2e-16 ***
age 0.162619 0.047108 3.452 0.000575 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 17.57 on 1245 degrees of freedom
(1 observation deleted due to missingness)
Multiple R-squared: 0.1853, Adjusted R-squared: 0.1833
F-statistic: 94.36 on 3 and 1245 DF, p-value: < 2.2e-16
• Predicting Max ABP Height
lm(formula = ABPMaxHeight ~ Gender + PAPDispersion + age, data
= train.df)
Residuals:
Min 1Q Median 3Q Max
-111.294 -13.560 -0.347 11.057 198.424
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 49.553921 4.585646 10.806 < 2e-16 ***
Gender -16.170207 1.490204 -10.851 < 2e-16 ***
PAPDispersion 0.026496 0.002567 10.322 < 2e-16 ***
age 0.254135 0.064530 3.938 8.66e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 24.06 on 1245 degrees of freedom
(1 observation deleted due to missingness)
Multiple R-squared: 0.1489, Adjusted R-squared: 0.1468
F-statistic: 72.6 on 3 and 1245 DF, p-value: < 2.2e-16
• Predicting Minimum ABP Area
lm(formula = ABPMinarea ~ PAPMinarea + ABPMinHeight +
PAPMinHeight,
data = train.df)
Residuals:
Min 1Q Median 3Q Max
-60.045 -5.438 -2.097 5.372 192.571
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.21008 0.65897 7.906 5.80e-15 ***
PAPMinarea 1.11374 0.05659 19.679 < 2e-16 ***
ABPMinHeight 0.27202 0.01716 15.848 < 2e-16 ***
PAPMinHeight 0.21420 0.04807 4.456 9.11e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 12.04 on 1245 degrees of freedom
(1 observation deleted due to missingness)
Multiple R-squared: 0.5615, Adjusted R-squared: 0.5605
F-statistic: 531.5 on 3 and 1245 DF, p-value: < 2.2e-16
• Predicting Mean ABP Area
lm(formula = ABPMeanArea ~ ABPMean + ABPMaxarea + PAPMeanArea,
data = train.df)
Residuals:
Min 1Q Median 3Q Max
-37.639 -4.872 -0.647 4.228 123.367
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -9.353212 1.118366 -8.363 <2e-16 ***
ABPMean 0.657800 0.014776 44.519 <2e-16 ***
ABPMaxarea 0.095570 0.005109 18.705 <2e-16 ***
PAPMeanArea 0.334689 0.024584 13.614 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 8.941 on 1244 degrees of freedom
(2 observations deleted due to missingness)
Multiple R-squared: 0.768, Adjusted R-squared: 0.7674
F-statistic: 1372 on 3 and 1244 DF, p-value: < 2.2e-16
• Predicting Maximum ABP Area
lm(formula = ABPMaxarea ~ ABPMeanHeight + PAPDispersion, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-98.97 -31.25 -12.67 14.87 393.45
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 92.518533 4.924250 18.788 < 2e-16 ***
ABPMeanHeight 0.249469 0.082767 3.014 0.00263 **
PAPDispersion 0.013733 0.005987 2.294 0.02198 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 55.69 on 1246 degrees of freedom
(1 observation deleted due to missingness)
Multiple R-squared: 0.01409, Adjusted R-squared: 0.01251
F-statistic: 8.905 on 2 and 1246 DF, p-value: 0.0001446
• Predicting Minimum Width
lm(formula = MinWidth ~ ABPMeanHeight + ABPMaxarea +
ECGDispersion,
data = train.df)
Residuals:
Min 1Q Median 3Q Max
-3.4533 -0.2469 0.0409 0.3166 1.6659
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.872e-01 3.539e-02 -13.766 < 2e-16 ***
ABPMeanHeight 1.197e-03 5.548e-04 2.157 0.03117 *
ABPMaxarea -5.760e-04 1.921e-04 -2.998 0.00277 **
ECGDispersion -5.167e-08 1.714e-08 -3.015 0.00263 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.3782 on 1247 degrees of freedom
Multiple R-squared: 0.01591, Adjusted R-squared: 0.01354
F-statistic: 6.72 on 3 and 1247 DF, p-value: 0.0001692
• Predicting Mean Width
lm(formula = MeanWidth ~ ABPMeanHeight + ABPMaxarea +
ECGDispersion,
data = train.df)
Residuals:
Min 1Q Median 3Q Max
-0.55950 -0.10260 -0.01259 0.09174 1.02648
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.050e-01 1.591e-02 6.596 6.24e-11 ***
ABPMeanHeight -2.085e-03 2.495e-04 -8.358 < 2e-16 ***
ABPMaxarea 6.834e-04 8.638e-05 7.912 5.57e-15 ***
ECGDispersion -2.296e-08 7.708e-09 -2.979 0.00294 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1701 on 1247 degrees of freedom
Multiple R-squared: 0.09761, Adjusted R-squared: 0.09543
F-statistic: 44.96 on 3 and 1247 DF, p-value: < 2.2e-16
• Predicting Max Width
lm(formula = MaxWidth ~ ABPMeanHeight + ABPMaxarea, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-2.2924 -0.2980 -0.0558 0.2162 4.8485
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4821213 0.0527422 9.141 < 2e-16 ***
ABPMeanHeight -0.0056265 0.0008257 -6.814 1.47e-11 ***
ABPMaxarea 0.0067383 0.0002864 23.529 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.5643 on 1248 degrees of freedom
Multiple R-squared: 0.315, Adjusted R-squared: 0.3139
F-statistic: 286.9 on 2 and 1248 DF, p-value: < 2.2e-16
• Predicting CVP
lm(formula = CVP ~ age + Gender, data = train.df)
Residuals:
Min 1Q Median 3Q Max
-51.496 -25.641 -19.083 -7.097 307.634
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 85.0138 11.7758 7.219 9.0e-13 ***
age -0.7286 0.1697 -4.293 1.9e-05 ***
Gender -12.0116 3.7771 -3.180 0.00151 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 63.83 on 1262 degrees of freedom
Multiple R-squared: 0.02258, Adjusted R-squared: 0.02103
F-statistic: 14.57 on 2 and 1262 DF, p-value: 5.525e-07
• Predicting Minimum PAP Area
> PMinA.lm<-lm(PAPMinarea~PAPMean+ABPMinHeight,data=train.df)
> summary(PMinA.lm)
Call:
lm(formula = PAPMinarea ~ PAPMean + ABPMinHeight, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-33.880 -3.385 0.121 3.332 55.142
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.820926 0.475902 -8.029 2.19e-15 ***
PAPMean 0.337027 0.013056 25.815 < 2e-16 ***
ABPMinHeight 0.147671 0.007316 20.186 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 5.77 on 1299 degrees of freedom
Multiple R-squared: 0.4593, Adjusted R-squared: 0.4585
F-statistic: 551.7 on 2 and 1299 DF, p-value: < 2.2e-16
• Predicting Mean PAP Area
> PMeanA.lm<-
lm(PAPMeanArea~PAPMean+ABPMaxarea+age,data=train.df)
> summary(PMeanA.lm)
Call:
lm(formula = PAPMeanArea ~ PAPMean + ABPMaxarea + age, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-19.309 -2.384 -0.285 1.945 35.328
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.662293 0.954847 -5.930 3.88e-09 ***
PAPMean 0.764760 0.010412 73.451 < 2e-16 ***
ABPMaxarea 0.041331 0.002247 18.395 < 2e-16 ***
age 0.026573 0.012122 2.192 0.0286 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.419 on 1298 degrees of freedom
Multiple R-squared: 0.829, Adjusted R-squared: 0.8286
F-statistic: 2097 on 3 and 1298 DF, p-value: < 2.2e-16
• Predicting Maximum PAP Area
> PMaxA.lm<-
lm(PAPMaxarea~ABPMaxarea+PAPMean+MaxWidth,data=train.df)
> summary(PMaxA.lm)
Call:
lm(formula = PAPMaxarea ~ ABPMaxarea + PAPMean + MaxWidth, data
= train.df)
Residuals:
Min 1Q Median 3Q Max
-109.033 -6.793 -1.518 4.399 197.616
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -35.34370 1.61342 -21.91 <2e-16 ***
ABPMaxarea 0.28853 0.01042 27.68 <2e-16 ***
PAPMean 1.40812 0.03935 35.78 <2e-16 ***
MaxWidth 10.44491 0.85533 12.21 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 17.34 on 1298 degrees of freedom
Multiple R-squared: 0.7136, Adjusted R-squared: 0.713
F-statistic: 1078 on 3 and 1298 DF, p-value: < 2.2e-16
• Predicting Minimum PAP Height
> PMinH.lm<-
lm(PAPMinHeight~PAPMinarea+ABPMinHeight,data=train.df)
> summary(PMinH.lm)
Call:
lm(formula = PAPMinHeight ~ PAPMinarea + ABPMinHeight, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-43.332 -4.079 -1.362 3.687 68.404
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.07348 0.37670 10.814 < 2e-16 ***
PAPMinarea 0.53506 0.02866 18.670 < 2e-16 ***
ABPMinHeight 0.05623 0.01027 5.474 5.27e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7.332 on 1299 degrees of freedom
Multiple R-squared: 0.3047, Adjusted R-squared: 0.3036
F-statistic: 284.6 on 2 and 1299 DF, p-value: < 2.2e-16
• Predicting Mean PAP Height
> PMeanH.lm<-
lm(PAPMeanHeight~PAPMean+PAPMeanArea,data=train.df)
> summary(PMeanH.lm)
Call:
lm(formula = PAPMeanHeight ~ PAPMean + PAPMeanArea, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-81.101 -5.476 0.042 5.624 55.212
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.81927 0.71443 15.144 < 2e-16 ***
PAPMean 0.25678 0.04579 5.608 2.5e-08 ***
PAPMeanArea 0.15842 0.05259 3.012 0.00264 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 9.448 on 1299 degrees of freedom
Multiple R-squared: 0.1993, Adjusted R-squared: 0.1981
F-statistic: 161.7 on 2 and 1299 DF, p-value: < 2.2e-16
• Predicting Maximum PAP Height
> PMaxH.lm<-lm(PAPMaxHeight~PAPMean+PAPMaxarea,data=train.df)
> summary(PMaxH.lm)
Call:
lm(formula = PAPMaxHeight ~ PAPMean + PAPMaxarea, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-114.183 -8.232 -0.822 6.912 91.303
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.40842 1.13299 15.365 <2e-16 ***
PAPMean 0.49323 0.04090 12.059 <2e-16 ***
PAPMaxarea 0.03272 0.01549 2.113 0.0348 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 14.97 on 1299 degrees of freedom
Multiple R-squared: 0.1671, Adjusted R-squared: 0.1658
F-statistic: 130.3 on 2 and 1299 DF, p-value: < 2.2e-16
• Predicting Mean ABP
> AMean.lm<-lm(ABPMean~ABPMaxarea+ABPMeanHeight,data=train.df)
> summary(AMean.lm)
Call:
lm(formula = ABPMean ~ ABPMaxarea + ABPMeanHeight, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-64.358 -8.482 -0.187 7.993 109.597
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 46.200193 1.541807 29.965 <2e-16 ***
ABPMaxarea 0.126373 0.008485 14.893 <2e-16 ***
ABPMeanHeight 0.224068 0.024127 9.287 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 16.72 on 1445 degrees of freedom
Multiple R-squared: 0.1904, Adjusted R-squared: 0.1893
F-statistic: 169.9 on 2 and 1445 DF, p-value: < 2.2e-16
• Predicting ABP Dispersion
> ADisp.lm<-lm(ABPDispersion~ABPMeanHeight,data=train.df)
> summary(ADisp.lm)
Call:
lm(formula = ABPDispersion ~ ABPMeanHeight, data = train.df)
Residuals:
Min 1Q Median 3Q Max
-61685 -82 8 104 21977
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -107.314 136.207 -0.788 0.431
ABPMeanHeight 11.268 2.513 4.485 7.88e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1751 on 1446 degrees of freedom
Multiple R-squared: 0.01372, Adjusted R-squared: 0.01304
F-statistic: 20.11 on 1 and 1446 DF, p-value: 7.885e-06
• Predicting SpO2
> SpO2.lm<-lm(SpO2~CVP+PAPMaxHeight+ABPMinHeight,data=train.df)
> summary(SpO2.lm)
Call:
lm(formula = SpO2 ~ CVP + PAPMaxHeight + ABPMinHeight, data =
train.df)
Residuals:
Min 1Q Median 3Q Max
-99.81 -51.14 26.31 41.01 61.44
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 32.04407 3.37212 9.503 < 2e-16 ***
CVP 0.14184 0.02009 7.060 2.55e-12 ***
PAPMaxHeight 0.37925 0.07758 4.889 1.13e-06 ***
ABPMinHeight 0.30704 0.05805 5.290 1.41e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 46.09 on 1491 degrees of freedom
Multiple R-squared: 0.06358, Adjusted R-squared: 0.06169
F-statistic: 33.74 on 3 and 1491 DF, p-value: < 2.2e-16
• Predicting PAP Dispersion
> PD.lm<-
lm(PAPDispersion~PAPMaxHeight+PAPMeanHeight+ABPMaxHeight,data=t
rain.df)
> summary(PD.lm)
Call:
lm(formula = PAPDispersion ~ PAPMaxHeight + PAPMeanHeight +
ABPMaxHeight,
data = train.df)
Residuals:
Min 1Q Median 3Q Max
-3055.7 -115.9 -22.0 80.4 3707.7
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 150.9331 22.4683 6.718 2.62e-11 ***
PAPMaxHeight 3.8337 0.7679 4.993 6.65e-07 ***
PAPMeanHeight 3.3647 1.1833 2.843 0.00452 **
ABPMaxHeight 2.2445 0.2680 8.375 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 246.2 on 1497 degrees of freedom
Multiple R-squared: 0.1737, Adjusted R-squared: 0.1721
F-statistic: 104.9 on 3 and 1497 DF, p-value: < 2.2e-16
Python code
from sklearn import svm
from sklearn.model_selection import KFold

# RBF-kernel SVM with probability estimates enabled.
cfr = svm.SVC(C=1, cache_size=1000, gamma=0.01, probability=True)
cv = KFold(n_splits=10)

results, falses, trues = [], [], []
for traincv, testcv in cv.split(train):
    # Fit on the training folds, then get P(class == 1) for the held-out fold.
    probas = cfr.fit(train[traincv], target[traincv]).predict_proba(train[testcv])
    probs = [x[1] for x in probas]
    # Classify as a true alarm (1) when the probability reaches the threshold p.
    predictions = [1 if p >= 0.1 else 0 for p in probs]
    # successrate() is a project helper (defined elsewhere) returning the
    # overall accuracy and the false/true positive rates for the fold.
    s, f, t = successrate(predictions, list(target[testcv]))
    results.append(s)
    falses.append(f)
    trues.append(t)
ROC Output
> roc(resp,pred,plot=TRUE)
Call:
roc.default(response = resp, predictor = pred, plot = TRUE)
Data: pred in 648 controls (resp 0) < 854 cases (resp 1).
Area under the curve: 0.4543