Severity-of-illness scoring systems have been developed to evaluate the delivery of care and to predict the outcome of groups of critically ill patients admitted to intensive care units. This prediction is achieved by collating routinely measured data specific to the patient. This article reviews the commonly used ICU scoring systems, the characteristics of the ideal scoring system, and the methods used to validate scoring systems.
Review Article
INTRODUCTION
Clinical outcome is an important measure of critical care
activity. It is the end result of all therapeutic
interventions applied to the patient, and it can be measured
from the perspective of patients, caretakers or various
health care personnel. It serves as the end point of
research, supports audit, performance benchmarking and
comparisons, and helps in the allocation of funds, as
resources are scant in comparison to the number of patients.
Intensive care given to non-survivors costs twice as much as
that given to survivors [1], and a major determinant of this
wasted expenditure is prognostic uncertainty [2]. Prognostic
tools may help physicians in the difficult task of
redirecting resources to the patients who are more likely to
benefit, with a greater chance of long-term survival.
Accurate and reliable data help health managers, economists
and politicians to resolve conflicts concerning areas of
health care, not only within the hospital but also in
establishing an appropriate balance between primary and
secondary health care, as their task involves distributive
justice to maximize good for the whole of society. Such data
are lacking in our country in spite of the large patient
population and widespread healthcare facilities.
INTENSIVE CARE UNIT SCORING SYSTEMS
Chitra Chatterji* and Anupam Raj**
*Consultant, **Registrar, Department of Anesthesiology, Indraprastha Apollo Hospitals,
Sarita Vihar, New Delhi 110 076, India.
Correspondence to: Dr Anupam Raj, Registrar, Department of Anesthesiology, Indraprastha Apollo Hospitals,
Sarita Vihar, New Delhi 110 076, India.
E-mail: anupamraj21@gmail.com
Key word: ICU scoring systems.
NEED OF SCORING SYSTEMS
(i) Prognosis
(ii) Cost-benefit analysis
(iii) Withdrawal of treatment
(iv) Comparison between different centres
(v) Monitoring and assessment of new therapies
(vi) Population sample collection in studies
MEASUREMENT OF OUTCOME
All outcome measures have limitations, as the results and
data concerning outcome can be overinterpreted. The most
important clinical outcome measures from the patient's
perspective are:
Survival – long-term survival is most important to
patients. A patient undergoing a surgical procedure will
also want to know the risk of the intervention, and hence
the short-term mortality.
Functional outcome – depicts the physical and mental
capabilities after recovery. Patients will desire an
independent life, at least at their previous level of
activity.
Quality of life – includes the patient's sense of well-being
and satisfaction.
PREDICTION OF OUTCOME
It is the probabilistic estimation of a binary outcome
(death or survival), usually at hospital discharge.
Severity scores can be classified according to their aim:
(a) to measure severity by assigning points according to
the severity of illness, (b) to predict outcome by assigning
a numerical estimate of the probability of outcome (hospital
mortality) to a group of similar patients.
89 Apollo Medicine, Vol. 8, No. 2, June 2011
Severity scoring systems are not perfect. They have
their false positives and false negatives. These scores
should not be applied to predict the outcome of a specific
individual.
Outcome prediction can be made using general outcome
prediction models. Such models aim to predict clinical
status at hospital discharge based on a given set of
variables evaluated on ICU admission or within 24 hrs. The
rationale behind their construction is that derangement of
homeostasis has an adverse effect on mortality, and that
the magnitude of change from normal for physiological and
laboratory variables is proportional to their effect on
outcome. Using a logistic regression equation, these models
predict the outcome in patients having a particular past
medical history and acute medical condition (defined by
the values of the predictive variables) and receiving
treatment in a (theoretical) reference ICU. Such models
were developed from multicenter databases. Ideally
these should be [3]:
(i) Time insensitive, i.e. provide an accurate mortality
prediction when used prospectively as well as
retrospectively.
(ii) A true estimate of the presenting risk of death, i.e. a
measure of severity of illness from data that are not
influenced by therapy. Assessment of severity of
illness from data collected within the first 24 hrs
(APACHE II) may become a measure of suboptimal care
rather than of severity of illness.
(iii) Calculated from data collected in the usual care of
the patient, e.g., pulse rate, blood pressure,
temperature etc. If weight is given to data gained from
the use of complex and expensive equipment, hospitals
will be rewarded for using more tests regardless of the
appropriateness or quality of care.
(iv) Calculated from objective data that cannot be
manipulated (unlike subjective data).
(v) Accurate at all levels of the scale, i.e. the system
must have accurate calibration. Current mortality
systems use regression techniques that tend to
underestimate the likelihood of death for more severely
ill patients and overestimate it for less severely ill
patients.
(vi) Simple, reliable and easily obtainable, without adding
markedly to administrative cost.
(vii) Open to review in all components, i.e. the system
(particularly the prediction equations) must be able to
be scrutinized and tested.
(viii) Widely applicable: different ICUs, all age groups,
all levels and types of ICUs.
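The logistic regression step on which general outcome prediction models rely can be sketched as follows. This is only an illustration: the function name, the coefficients and the two predictors (a severity score and an emergency-surgery flag) are hypothetical and are not taken from any published model.

```python
import math


def predicted_mortality(intercept, weights, values):
    """Return a probability of hospital death from a logistic model.

    logit = intercept + sum(weight_i * value_i)
    probability = 1 / (1 + exp(-logit))
    """
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    logit = intercept + sum(w * v for w, v in zip(weights, values))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical coefficients, for illustration only:
# a severity score of 20 points and an emergency-surgery flag of 1.
risk = predicted_mortality(-3.0, [0.1, 0.5], [20, 1])
print(f"Predicted risk of death: {risk:.2f}")
```

The result applies to a group of similar patients; as noted earlier, it should not be read as the prognosis of one individual.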
VALIDITY OF PREDICTIVE SCORES
All statistical models need validation. This evaluation
should be performed at regional and national levels and
should consist of (a) overall goodness of fit
(discrimination and calibration), (b) uniformity of fit.
Overall goodness of fit
Can be divided into discrimination and calibration.
Discrimination [4]: it is the ability of the score to
distinguish survivors from non-survivors and is usually
evaluated using the area under the receiver operating
characteristic (ROC) curve. Interpretation of
discrimination is easy: a perfect model will have an area
under the ROC curve of 1.0, and a model whose discriminative
capability is no greater than chance has an area of
0.5. For most models this value should be greater than 0.8.
The greater the area under the ROC curve, the better the
discrimination of the model.
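As a rough sketch of how discrimination can be quantified, the area under the ROC curve can be computed as the proportion of (non-survivor, survivor) pairs in which the non-survivor received the higher predicted risk, with ties counted as one half. The function name and the six patients below are made up for illustration.

```python
def roc_auc(predicted_risk, died):
    """Area under the ROC curve by pairwise comparison.

    Equals the probability that a randomly chosen non-survivor was given
    a higher predicted risk than a randomly chosen survivor.
    """
    risk_dead = [p for p, d in zip(predicted_risk, died) if d]
    risk_alive = [p for p, d in zip(predicted_risk, died) if not d]
    if not risk_dead or not risk_alive:
        raise ValueError("need at least one survivor and one non-survivor")
    wins = 0.0
    for p_dead in risk_dead:
        for p_alive in risk_alive:
            if p_dead > p_alive:
                wins += 1.0
            elif p_dead == p_alive:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(risk_dead) * len(risk_alive))


# Made-up predictions for six patients (1 = died, 0 = survived).
risks = [0.9, 0.7, 0.6, 0.4, 0.2, 0.1]
outcome = [1, 1, 0, 1, 0, 0]
auc = roc_auc(risks, outcome)
print(f"Area under ROC curve: {auc:.2f}")
```

The pairwise form is quadratic in the number of patients, which is acceptable for a small audit sample; larger databases would normally use a rank-based routine from a statistics package.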
Calibration [4]: it is the degree of correspondence between
the probabilities assigned by the model and the observed
mortalities. It is usually evaluated by two statistical tests
proposed by Hosmer and Lemeshow (the C test and the H
test), which divide the population into deciles of risk and
compare the expected with the actual number of survivors
and non-survivors in each decile. A more intuitive, although
less formal, evaluation can be made by the use of
calibration curves. The greater the agreement between the
observed and predicted mortality, the better the calibration
of the model.
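The comparison of expected and observed deaths across risk groups can be sketched as below. This is a simplified illustration of a Hosmer-Lemeshow style C statistic, assuming equal-sized groups ordered by predicted risk; the function name and data are hypothetical, and the significance test against a chi-square distribution is left out.

```python
def hosmer_lemeshow_c(predicted_risk, died, groups=10):
    """Goodness-of-fit statistic over equal-sized groups ordered by risk."""
    pairs = sorted(zip(predicted_risk, died))
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one patient per group")
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected_deaths = sum(p for p, _ in chunk)
        observed_deaths = sum(d for _, d in chunk)
        expected_survivors = size - expected_deaths
        observed_survivors = size - observed_deaths
        # Add (observed - expected)^2 / expected for deaths and survivors;
        # a term is skipped when its expected count is zero.
        if expected_deaths > 0:
            statistic += (observed_deaths - expected_deaths) ** 2 / expected_deaths
        if expected_survivors > 0:
            statistic += (observed_survivors - expected_survivors) ** 2 / expected_survivors
    return statistic


# Made-up cohort: five low-risk and five high-risk patients, two groups.
risks = [0.2] * 5 + [0.8] * 5
deaths = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
print(hosmer_lemeshow_c(risks, deaths, groups=2))
```

A small statistic indicates close agreement between expected and observed outcomes; a large one indicates poor calibration.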
Uniformity of fit
It reflects the model's performance in subgroups of
patients. The rationale of this evaluation is similar to that
of general regression, where influential observations that
can have an important impact on the overall performance of
the model are specifically explored. Although no consensus
exists about the best technique for identifying these
subgroups, we can use discrimination and calibration in
subgroups of patients. The most influential factors are those
related to major case mix components:
• Location in hospital before ICU admission
• Patient type (emergency surgery, non-operative,
scheduled surgery)
• Degree of physiological dysfunction
• Physiological reserve (age, chronic diagnosis)
• Acute diagnosis
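One simple way to look at fit within case mix subgroups is to compare observed deaths with the deaths expected from the model in each subgroup. The sketch below assumes this observed-to-expected ratio as the check; the ratio, the function name and the four patients are illustrative choices, not a method prescribed by the article.

```python
from collections import defaultdict


def observed_expected_by_subgroup(patients):
    """patients: iterable of (subgroup, predicted_risk, died) tuples.

    Returns {subgroup: (observed_deaths, expected_deaths, ratio)}.
    """
    observed = defaultdict(int)
    expected = defaultdict(float)
    for subgroup, risk, died in patients:
        observed[subgroup] += int(died)
        expected[subgroup] += risk  # expected deaths = sum of predicted risks
    result = {}
    for subgroup in observed:
        exp = expected[subgroup]
        ratio = observed[subgroup] / exp if exp > 0 else float("nan")
        result[subgroup] = (observed[subgroup], exp, ratio)
    return result


# Made-up cohort grouped by patient type.
cohort = [
    ("emergency surgery", 0.5, 1), ("emergency surgery", 0.5, 0),
    ("scheduled surgery", 0.25, 1), ("scheduled surgery", 0.25, 1),
]
summary = observed_expected_by_subgroup(cohort)
for name, (obs, exp, ratio) in summary.items():
    print(f"{name}: observed {obs}, expected {exp:.2f}, ratio {ratio:.2f}")
```

A ratio far from 1 in one subgroup, with a ratio near 1 overall, suggests that the model does not fit that subgroup uniformly.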
HISTORY OF SCORING SYSTEMS
1953 - APGAR score by Virginia Apgar
1973 - Glasgow Coma Score
1981 - APACHE (acute physiology and chronic health
evaluation) and SAPS (simplified acute physiology
score)
1991 - APACHE III
1993 - SAPS II
2005 - SAPS III
During the development of scoring systems, the main
prognostic determinant of outcome changed (Table 1).
ACUTE PHYSIOLOGY AND CHRONIC HEALTH
EVALUATION SCORE
APACHE was proposed by William Knaus and initially had 34
physiological variables; APACHE II, published in 1985 [9],
has 12 variables. It allows estimation of the probability of
death before discharge from the hospital. It is one of the
most commonly used scores in ICUs. The initial score is
commonly calculated from the worst of the values observed in
the first 24 hrs.
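Selecting the worst value in the first 24 hrs can be sketched as picking the reading that earns the most points. The point bands below are hypothetical and serve only to make the example run; they are not the published APACHE II table.

```python
def worst_points(readings, points_for):
    """Return (points, reading) for the reading that scores the most points."""
    if not readings:
        raise ValueError("no readings recorded")
    worst = max(readings, key=points_for)
    return points_for(worst), worst


def toy_heart_rate_points(bpm):
    """Hypothetical bands for illustration; not the published APACHE II table."""
    if 70 <= bpm <= 109:
        return 0
    if 55 <= bpm < 70 or 110 <= bpm <= 139:
        return 2
    return 4


heart_rates_first_24h = [88, 96, 132, 150, 104]
points, reading = worst_points(heart_rates_first_24h, toy_heart_rate_points)
print(f"Worst heart rate {reading}/min scores {points} points")
```

Because the maximum is taken over all recorded readings, adding more readings can only keep or raise the points assigned, never lower them.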
APACHE III, introduced in 1991, has 6 physiological
parameters in addition to those present in APACHE II, i.e.
blood urea nitrogen, bilirubin, PaCO2, serum glucose, urine
output and serum albumin. PaCO2 is combined with pH as a
single acid-base variable. Potassium is not used, and GCS is
in an abbreviated form. A minimum of 9 physiological values
is required to provide a score. The chronic health
component has 6 questions regarding haematological
malignancies, metastatic cancer, immune suppression, hepatic
failure, cirrhosis and AIDS. The performance of the APACHE
III severity score is slightly better than that of APACHE
II, but the former has not achieved widespread acceptance,
perhaps because the statistical analysis used to score it is
under copyright control.
APACHE IV was introduced in 2006. It is a complex
score with 142 variables. The score is not applicable to
patients younger than 16 years, burn patients and patients
transferred from other ICUs. It also has a separate scoring
system for post-CABG patients. Disease-specific scores
include 116 disease categories. Web-based calculation can be
done at cerner.com
Criticism of APACHE
• The initial score cannot be calculated at admission,
because the score includes the worst values in the first
24 hrs.
• The score is sensitive to the precision with which the
variables are measured: ICUs using continuous electronic
monitoring of relevant variables typically evaluate
patients as more severely ill.
• Proprietary nature of APACHE III and associated
cost of the instrument
• Complicated scoring system for specific elements in
chronic disease
• Exaggerated penalty for old age
• Complexity of APACHE IV with 142 variables
Table 1. Classification
General scores: SAPS [5]; APACHE; SOFA (sequential organ
failure assessment); MODS (multi organ dysfunction
score) [6]; ODIN (organ dysfunction and/or infection); MPM
(mortality prediction models); LODS (logistic organ
dysfunction system) [7]; SS (multivariate sickness score);
TRIOS (three days recalibrated ICU outcome score)
Specialised and surgical intensive care: Lung resection
score; EUROSCORE; ONTARIO; Parsonnet score; System 97
score; QMMI score; POSSUM (physiological and operative
severity score for enumeration of mortality and morbidity);
IRISS score
Trauma score: ISS (injury severity score); RTS (revised
trauma score); TRISS (trauma injury severity score); ASCOT
(a severity characterisation of trauma); 24 hr ICU Trauma
score; GCS [8]
Therapeutic intervention nursing score: TISS (therapeutic
intervention scoring system); TISS-28 (simplified TISS);
NEMS (nine equivalents of nursing manpower use score)
SIMPLIFIED ACUTE PHYSIOLOGY SCORE (SAPS)
SAPS was first introduced in 1984 to simplify the
diagnostic data collection in APACHE, focusing only on
prognostic measures easily and typically measured in ICU
patients. SAPS II was introduced in 1993 following a
combined European and North American study. It comprises:
• 12 physiological variables
• Age
• Type of admission (non-operative, emergency
surgery, elective surgery)
• Prior health record (AIDS, metastatic cancer,
haematological cancer)
SAPS III was introduced in 2005. It includes prognostic
variables that were missing in previous scores (such as
diagnostic information or presence of infection). An Access
file with calculation sheet and tables for data management
is free and can be downloaded from www.saps3.org
MORTALITY PROBABILITY MODELS II
MPM was developed by Stanley Lemeshow [10,11].
The MPM II models were described in 1993 to 1994 and
were based on the same data as used for SAPS II, with
additional data from 6 other ICUs in North America. In
these models the final result is given only as the
probability of death rather than as a score. They comprise
4 different models.
The MPM II admission model (MPM0) is computed within one
hour of admission to the ICU. The model contains 15
variables. It is the only general model that is independent
of the treatment provided in the ICU, and can therefore be
used for patient stratification at the time of admission
to the ICU.
The MPM II 24 hr model comprises 13 variables
collected 24 hrs after admission to the ICU.
The MPM II 48 hr model is computed 48 hrs after ICU
admission.
The MPM II 72 hr model is computed 72 hrs after
admission to the ICU.
All physiological variables are calculated based on the
worst values in the first 24 hrs of admission. The MPM II
48 hr and MPM II 72 hr models use the same variables as the
MPM II 24 hr model, with different weights for the
calculation of the risk of death. Both are based on the
worst values recorded in the previous 24 hrs.
SEQUENTIAL ORGAN FAILURE ASSESSMENT
The SOFA [12] was produced by a group from the
European Society of Intensive Care Medicine to describe
the degree of organ dysfunction associated with sepsis.
However, it has since been validated to describe the
degree of organ dysfunction in patient groups with organ
dysfunction not due to sepsis. Six organ systems are
assessed: respiratory (PaO2/FiO2), cardiovascular (blood
pressure), central nervous system (GCS), renal (creatinine
or urine output), coagulation (platelet count) and liver
(bilirubin). Each is graded from 0 to 4, giving a final
score of 0 to 24 (maximum). SOFA correlates well with both
ICU and non-ICU in-hospital mortality rates [13,14]. The
SOFA scoring system has been proposed for use as a tool for
triage in mass casualty scenarios when critical care
resources are extremely taxed [15].
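The arithmetic of the SOFA score can be sketched as a sum of six organ sub-scores. The sketch assumes each organ system is graded 0 to 4, as in the original SOFA description [12]; the platelet cut-offs are those commonly tabulated for the coagulation component and should be checked against [12] before any use. The function and key names are hypothetical.

```python
ORGAN_SYSTEMS = (
    "respiratory", "cardiovascular", "cns", "renal", "coagulation", "liver",
)


def platelet_points(platelets_k_per_ul):
    """Coagulation sub-score from platelet count (x10^3 per microlitre).

    Cut-offs as commonly tabulated for SOFA; verify against [12].
    """
    if platelets_k_per_ul < 20:
        return 4
    if platelets_k_per_ul < 50:
        return 3
    if platelets_k_per_ul < 100:
        return 2
    if platelets_k_per_ul < 150:
        return 1
    return 0


def sofa_total(sub_scores):
    """Sum the six organ sub-scores after checking each lies in 0..4."""
    total = 0
    for organ in ORGAN_SYSTEMS:
        points = sub_scores[organ]
        if points not in (0, 1, 2, 3, 4):
            raise ValueError(f"{organ} sub-score must be an integer from 0 to 4")
        total += points
    return total


# Made-up patient: the other five sub-scores are assumed already graded.
patient = {
    "respiratory": 2, "cardiovascular": 1, "cns": 0, "renal": 1,
    "coagulation": platelet_points(85), "liver": 0,
}
print(sofa_total(patient))
```

Keeping the sub-scores separate until the final sum preserves the per-organ description that distinguishes SOFA from mortality prediction scores.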
Difference between commonly used scores and the
SOFA score
Scoring systems                        SOFA score
Evaluate risk of mortality             Evaluates morbidity
Aim: prediction                        Aim: description
Often complex                          Simple, easily calculated
Do not individualize the degree of     Individualizes the degree of
dysfunction/failure of each organ      dysfunction/failure of each organ
MULTIPLE ORGAN DYSFUNCTION SCORE
The MODS [16] scores six organ systems: respiratory
(PaO2/FiO2 ratio in arterial blood); renal (serum
creatinine); hepatic (serum bilirubin concentration);
cardiovascular (pressure-adjusted heart rate);
haematological (platelet count); and central nervous system
(Glasgow Coma Score), with weighted scores (0-4) awarded for
increasing abnormality of each organ system. Scoring is
performed on a daily basis and so allows a day-by-day
prediction for patients.
CONCLUSION
APACHE II, APACHE III, SAPS II and MPM II give
comparable results.
All have good discrimination but poor calibration.
They can be used to compare study populations in RCTs and
to assess ICU performance. While assessing the performance
of an ICU, the results may be confounded by the fact that
ICUs that admit sicker patients will have higher than
predicted mortality.
Patient selection can be made for therapeutic
interventions, e.g. the indication for XIGRIS in severe
sepsis depending upon the number of organ systems involved.
Decisions on withdrawal of support cannot be based on
the current scoring systems, as their area under the ROC
curve (discrimination) is far less than 0.99.
REFERENCES
1. Sage WM, Rosenthal MH, Silverman JF. Is intensive care
worth it? An assessment of input and outcome of critically
ill. Crit Care Med. 1986; 14: 777-782.
2. Detsky AS, Stricker SC, Mulley AG, Thibault GE.
Prognosis, survival and the expenditure of hospital
resources for patients in intensive care unit. N Engl J
Med. 1981; 305: 667-672.
3. Selker HP. Systems for comparing actual and predicted
mortality rates: characteristics to promote cooperation in
improving hospital care. Ann Intern Med. 1993; 118: 820-
822.
4. Moreno R, Morais P. Outcome prediction in intensive
care. Results of prospective, multicenter, Portuguese
study. Intensive Care Med. 1997; 23: 177-186.
5. Le-Gall JR, Loirat P, Alperovitch A, et al. A simplified
acute physiology score for ICU. Crit Care Med. 1984; 12:
975-977.
6. Lemeshow S, Teres D, Avrunin JS, Gage RW. Refining
intensive care outcome prediction using changing
probabilities of mortality. Crit Care Med. 1988; 16: 470-477.
7. Le-Gall JR, Lemeshow S, Saulnier F, Alberti C, Teres D.
For the ICU scoring group. The logistic organ dysfunction
system. A new way to assess organ dysfunction in ICU.
JAMA. 1996; 276: 802-810.
8. Baker SP, O'Neill B, Haddon W, Long WB. The injury
severity score, a method of describing patients with
multiple injuries and evaluating emergency care. J
Trauma. 1974; 14: 187-196.
9. Knaus WA, Draper EA, Wagner DP, Zimmerman JE.
APACHE II: a severity of disease classification
system. Crit Care Med. 1985; 13(10): 818-829.
10. Lemeshow S, et al. Mortality probability models (MPM II)
based on an international cohort of intensive care
patients. JAMA. 1993; 270: 2478-2486.
11. Lemeshow S, Le Gall JR. Modeling the severity of illness
of ICU patients. A systems update. JAMA. 1994; 272:
1049-1055.
12. Vincent JL, Moreno R, Takala J, et al. The SOFA (Sepsis-
related organ failure assessment) score to describe
organ dysfunction/failure. On behalf of the Working
Group on Sepsis-Related Problems of the European
Society of Intensive Care Medicine. Intensive Care
Med. 1996; 22: 707-710.
13. Kajdacsy-Balla Amaral AC, Andrade FM, Moreno R, et al.
Use of the sequential organ failure assessment score as
a severity score. Intensive Care Med. 2005; 31: 243-
249.
14. Ferreira FL, Bota DP, Bross A, Melot C, Vincent JL. Serial
evaluation of the SOFA score to predict outcome in
critically ill patients. JAMA. 2001; 286: 1754-1758.
15. Centers for Disease Control and Prevention. Interim
guidance for protection of persons involved in U.S. Avian
influenza outbreak disease control and eradication
activities. http://www.cdc.gov/flu/avian/professional/
protect-guid.htm Accessed February 20, 2008.
16. Marshall JC, Cook DJ, Christou NV, Bernard GR, Sprung
CL, Sibbald WJ. Multiple organ dysfunction score: a
reliable descriptor of a complex clinical outcome. Crit
Care Med. 1995; 23: 1638-1652.