SERIES ‘‘ATSERS TASK FORCE STANDARDISATION OF LUNGFUNCTION.docx

SERIES ‘‘ATS/ERS TASK FORCE: STANDARDISATION OF
LUNG
FUNCTION TESTING’’
Edited by V. Brusasco, R. Crapo and G. Viegi
Number 2 in this Series
Standardisation of spirometry
M.R. Miller, J. Hankinson, V. Brusasco, F. Burgos, R. Casaburi,
A. Coates,
R. Crapo, P. Enright, C.P.M. van der Grinten, P. Gustafsson, R.
Jensen,
D.C. Johnson, N. MacIntyre, R. McKay, D. Navajas, O.F.
Pedersen, R. Pellegrino,
G. Viegi and J. Wanger
CONTENTS
Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . 320
FEV1 and FVC manoeuvre . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . 321
Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 321
Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 321
Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . 321

Display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 321
Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 322
Quality control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 322
Quality control for volume-measuring devices . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . 322
Quality control for flow-measuring devices . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . 323
Test procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 323
Within-manoeuvre evaluation . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . 324
Start of test criteria. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 324
End of test criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 324
Additional criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 324
Summary of acceptable blow criteria . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 325
Between-manoeuvre evaluation . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 325

Manoeuvre repeatability . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 325
Maximum number of manoeuvres . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . 326
Test result selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 326
Other derived indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 326
FEVt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 326
Standardisation of FEV1 for expired volume, FEV1/FVC and
FEV1/VC . . . . . . . . . . . . . . . . . . . . 326
FEF25–75% . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 326
PEF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . 326
Maximal expiratory flow–volume loops . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . 326
Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 326
Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . 327
Test procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . 327

Within- and between-manoeuvre evaluation . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 327
Flow–volume loop examples . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 327
Reversibility testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . 327
Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . 327
Comment on dose and delivery method . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . 328
Determination of reversibility . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 328
VC and IC manoeuvre . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 329
Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . 329
AFFILIATIONS
For affiliations, please see
Acknowledgements section
CORRESPONDENCE
V. Brusasco
Internal Medicine

University of Genoa
V.le Benedetto XV, 6
I-16132 Genova
Italy
Fax: 39 103537690
E-mail: [email protected]
Received:
March 23 2005
Accepted after revision:
April 05 2005
European Respiratory Journal
Print ISSN 0903-1936
Online ISSN 1399-3003
Previous articles in this series: No. 1: Miller MR, Crapo R,
Hankinson J, et al. General considerations for lung function
testing. Eur Respir J 2005; 26:
153–161.
EUROPEAN RESPIRATORY JOURNAL VOLUME 26
NUMBER 2 319
Eur Respir J 2005; 26: 319–338

DOI: 10.1183/09031936.05.00034805
Copyright�ERS Journals Ltd 2005
c
VC and IVC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .329
IC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .329
Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .329
Test procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .329
VC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .329
IC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .330
Use of a nose clip . . . . . . . . . . . . . . . . . . . . . . . . . . .330
Within-manoeuvre evaluation . . . . . . . . . . . . . . . . . . . . .330
Between-manoeuvre evaluation . . . . . . . . . . . . . . . . . . .330
Test result selection . . . . . . . . . . . . . . . . . . . . . . . . . . .330
Peak expiratory flow . . . . . . . . . . . . . . . . . . . . . . . . . . .330
Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .330
Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .330
Test procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .330

Maximum voluntary ventilation . . . . . . . . . . . . . . . . . . .331
Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .331
Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .331
Test procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .331
Technical considerations . . . . . . . . . . . . . . . . . . . . . . . .331
Minimal recommendations for spirometry systems . . . . . .331
BTPS correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .332
Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .332
Test signals for spirometer testing . . . . . . . . . . . . . . . . .333
Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .333
Accuracy test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .333
Repeatability test . . . . . . . . . . . . . . . . . . . . . . . . . . . .333

Test signals for PEF meter testing . . . . . . . . . . . . . . . . .333
Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .333
Accuracy test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .333
Repeatability test . . . . . . . . . . . . . . . . . . . . . . . . . . . .334
Test signals for MVV testing . . . . . . . . . . . . . . . . . . . . . .334
Abbreviations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .334
Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .335
KEYWORDS: Peak expiratory flow, spirometry, spirometry
standardisation, spirometry technique, spirometry traning,
ventilation
BACKGROUND
Spirometry is a physiological test that measures how an
individual inhales or exhales volumes of air as a function of
time. The primary signal measured in spirometry may be
volume or flow.
Spirometry is invaluable as a screening test of general
respiratory health in the same way that blood pressure
provides important information about general cardiovascular
health. However, on its own, spirometry does not lead
clinicians directly to an aetiological diagnosis. Some indica-
tions for spirometry are given in table 1.
In this document, the most important aspects of spirometry are
the forced vital capacity (FVC), which is the volume delivered
during an expiration made as forcefully and completely as
possible starting from full inspiration, and the forced expira-

tory volume (FEV) in one second, which is the volume
delivered in the first second of an FVC manoeuvre. Other
spirometric variables derived from the FVC manoeuvre are
also addressed.
Spirometry can be undertaken with many different types of
equipment, and requires cooperation between the subject and
the examiner, and the results obtained will depend on
technical as well as personal factors (fig. 1). If the variability
of the results can be diminished and the measurement
accuracy can be improved, the range of normal values for
populations can be narrowed and abnormalities more easily
detected. The Snowbird workshop held in 1979 resulted in the
first American Thoracic Society (ATS) statement on the
standardisation of spirometry [1]. This was updated in 1987
and again in 1994 [2, 3]. A similar initiative was undertaken by
the European Community for Steel and Coal, resulting in the
first European standardisation document in 1983 [4]. This was
then updated in 1993 as the official statement of the European
Respiratory Society (ERS) [5]. There are generally only minor
differences between the two most recent ATS and ERS
statements, except that the ERS statement includes absolute
lung volumes and the ATS does not.
This document brings the views of the ATS and ERS together
in an attempt to publish standards that can be applied more
TABLE 1 Indications for spirometry
Diagnostic
To evaluate symptoms, signs or abnormal laboratory tests
To measure the effect of disease on pulmonary function

To screen individuals at risk of having pulmonary disease
To assess pre-operative risk
To assess prognosis
To assess health status before beginning strenuous physical
activity
programmes
Monitoring
To assess therapeutic intervention
To describe the course of diseases that affect lung function
To monitor people exposed to injurious agents
To monitor for adverse reactions to drugs with known
pulmonary toxicity
Disability/impairment evaluations
To assess patients as part of a rehabilitation programme
To assess risks as part of an insurance evaluation
To assess individuals for legal reasons
Public health
Epidemiological surveys
Derivation of reference equations

Clinical research
STANDARDISATION OF SPIROMETRY M.R. MILLER ET
AL.
320 VOLUME 26 NUMBER 2 EUROPEAN RESPIRATORY
JOURNAL
widely. The statement is structured to cover definitions,
equipment and patient-related procedures. All recording
devices covered by this statement must meet the relevant
requirements, regardless of whether they are for monitoring
or diagnostic purposes. There is no separate category for
‘‘monitoring’’ devices.
Although manufacturers have the responsibility for producing
pulmonary function testing systems that satisfy all the
recommendations presented here, it is possible that, for some
equipment, meeting all of them may not always be achievable.
In these circumstances, manufacturers should clearly identify
which equipment requirements have not been met. While
manufacturers are responsible for demonstrating the accuracy
and reliability of the systems that they sell, it is the user who is
responsible for ensuring that the equipment’s measurements
remain accurate. The user is also responsible for following
local law, which may have additional requirements. Finally,
these guidelines are minimum guidelines, which may not be
sufficient for all settings, such as when conducting research,
epidemiological studies, longitudinal evaluations and occupa-
tional surveillance.
FEV1 AND FVC MANOEUVRE
Definitions
FVC is the maximal volume of air exhaled with maximally

forced effort from a maximal inspiration, i.e. vital capacity
performed with a maximally forced expiratory effort,
expressed in litres at body temperature and ambient pressure
saturated with water vapour (BTPS; see BTPS correction
section).
FEV1 is the maximal volume of air exhaled in the first second
of a forced expiration from a position of full inspiration,
expressed in litres at BTPS.
Equipment
Requirements
The spirometer must be capable of accumulating volume for
o15 s (longer times are recommended) and measuring
volumes of o8 L (BTPS) with an accuracy of at least ¡3% of
reading or ¡0.050 L, whichever is greater, with flows between
0 and 14 L?s-1. The total resistance to airflow at 14.0 L?s-1
must
be ,1.5 cmH2O?L
-1
?s-1 (0.15 kPa?L-1?s-1; see Minimal recommen-
dations for spirometry systems section). The total resistance
must
be measured with any tubing, valves, pre-filter, etc. included
that may be inserted between the subject and the spirometer.
Some devices may exhibit changes in resistance due to water
vapour condensation, and accuracy requirements must be met
under BTPS conditions for up to eight successive FVC
manoeuvres performed in a 10-min period without inspiration
from the instrument.
Display
For optimal quality control, both flow–volume and volume–

time displays are useful, and test operators should visually
inspect the performance of each manoeuvre for quality
assurance before proceeding with another manoeuvre. This
inspection requires tracings to meet the minimum size and
resolution requirements set forth in this standard.
Displays of flow versus volume provide more detail for the
initial portion (first 1 s) of the FVC manoeuvre. Since this
portion of the manoeuvre, particularly the peak expiratory
flow (PEF), is correlated with the pleural pressure during the
manoeuvre, the flow–volume display is useful to assess the
magnitude of effort during the initial portions of the
manoeuvre. The ability to overlay a series of flow–volume
curves registered at the point of maximal inhalation may
be helpful in evaluating repeatability and detecting sub-
maximal efforts. However, if the point of maximal inhala-
tion varies between blows, then the interpretation of these
results is difficult because the flows at identical measured
volumes are being achieved at different absolute lung
volumes. In contrast, display of the FVC manoeuvre as a
volume–time graph provides more detail for the latter part
of the manoeuvre. A volume–time tracing of sufficient
size also allows independent measurement and calculation
of parameters from the FVC manoeuvres. In a display of
multiple trials, the sequencing of the blows should be
apparent to the user.
For the start of test display, the volume–time display should
include o0.25 s, and preferably 1 s, before exhalation starts
(zero volume). This time period before there is any change in
volume is needed to calculate the back extrapolated volume
(EV; see Start of test criteria section) and to evaluate effort
during the initial portion of the manoeuvre. Time zero, as
defined by EV, must be presented as the zero point on the
graphical output.

The last 2 s of the manoeuvre should be displayed to indicate a
satisfactory end of test (see End of test criteria section).
When a volume–time curve is plotted as hardcopy, the volume
scale must be o10 mm?L-1 (BTPS). For a screen display,
5 mm?L-1 is satisfactory (table 2).
The time scale should be o20 mm?s-1, and larger time
scales are preferred (o30 mm?s-1) when manual measure-
ments are made [1, 6, 7]. When the volume–time plot is used in
conjunction with a flow–volume curve (i.e. both display
methods are provided for interpretations and no hand
��
��
��
��
��
��
�
��
�
� �
�
��
��
��
��
��
��
��

��
��
��
�
��
��
��
��
��
FIGURE 1. Spirometry standardisation steps.
M.R. MILLER ET AL. STANDARDISATION OF
SPIROMETRY
c
EUROPEAN RESPIRATORY JOURNAL VOLUME 26
NUMBER 2 321
measurements are performed), the time scale requirement
is reduced to 10 mm?s-1 from the usually required minimum
of 20 mm?s-1 (table 2). The rationale for this exception is
that the flow–volume curve can provide the means for
quality assessment during the initial portion of the FVC
manoeuvre. The volume–time curve can be used to evaluate
the latter part of the FVC manoeuvre, making the time scale
less critical.

Validation
It is strongly recommended that spirometry systems should be
evaluated using a computer-driven mechanical syringe or its
equivalent, in order to test the range of exhalations that are
likely to be encountered in the test population. Testing the
performance of equipment is not part of the usual laboratory
procedures (see Test signals for spirometer testing section).
Quality control
Attention to equipment quality control and calibration is
an important part of good laboratory practice. At a minimum,
the requirements are as follows: 1) a log of calibration results
is maintained; 2) the documentation of repairs or other
alterations which return the equipment to acceptable opera-
tion; 3) the dates of computer software and hardware
updates or changes; and 4) if equipment is changed or
relocated (e.g. industrial surveys), calibration checks and
quality-control procedures must be repeated before further
testing begins.
Key aspects of equipment quality control are summarised in
table 3.
Calibration is the procedure for establishing the relationship
between sensor-determined values of flow or volume and the
actual flow or volume.
A calibration check is different from calibration and is the
procedure used to validate that the device is within calibration
limits, e.g. ¡3% of true. If a device fails its calibration check,
then a new calibration procedure or equipment maintenance is
required. Calibration checks must be undertaken daily, or
more frequently, if specified by the manufacturer.
The syringe used to check the volume calibration of spiro-
meters must have an accuracy of ¡15 mL or ¡0.5% of the full

scale (15 mL for a 3-L syringe), and the manufacturer must
provide recommendations concerning appropriate intervals
between syringe calibration checks. Users should be aware
that a syringe with an adjustable or variable stop may be out
of calibration if the stop is reset or accidentally moved.
Calibration syringes should be periodically (e.g. monthly) leak
tested at more than one volume up to their maximum; this can
be done by attempting to empty them with the outlet corked. A
dropped or damaged syringe should be considered out of
calibration until it is checked.
With regard to time, assessing mechanical recorder time scale
accuracy with a stopwatch must be performed at least
quarterly. An accuracy of within 2% must be achieved.
Quality control for volume-measuring devices
The volume accuracy of the spirometer must be checked at least
daily, with a single discharge of a 3-L calibrated syringe. Daily
calibration checking is highly recommended so that the onset of
a problem can be determined within 1 day, and also to help
define day-to-day laboratory variability. More frequent checks
may be required in special circumstances, such as: 1) during
industrial surveys or other studies in which a large number of
subject manoeuvres are carried out, the equipment’s calibration
should be checked more frequently than daily [8]; and 2) when
the ambient temperature is changing (e.g. field studies), volume
accuracy must be checked more frequently than daily and the
BTPS correction factor appropriately updated.
The accuracy of the syringe volume must be considered in
determining whether the measured volume is within accep-
table limits. For example, if the syringe has an accuracy of
0.5%, a reading of ¡3.5% is appropriate.
The calibration syringe should be stored and used in such a

way as to maintain the same temperature and humidity of the
testing site. This is best accomplished by keeping the syringe in
close proximity to the spirometer, but out of direct sunlight
and away from heat sources.
Volume-type spirometer systems must be evaluated for leaks
every day [9, 10]. The importance of undertaking this daily test
cannot be overstressed. Leaks can be detected by applying a
constant positive pressure of o3.0 cmH2O (0.3 kPa) with the
spirometer outlet occluded (preferably at or including the
mouthpiece). Any observed volume loss .30 mL after 1 min
indicates a leak [9, 10] and needs to be corrected.
TABLE 2 Recommended minimum scale factors for time,
volume and flow on graphical output
Instrument display Hardcopy graphical outputParameter
Resolution
required
Scale
factor
Resolution
required
Scale
factor
Volume# 0.050 L 5 mm?L-1 0.025 L 10 mm?L-1

Flow# 0.200 L?s-1 2.5 mm?L-1?s-1 0.100 L?s-1 5 mm?L-1?s-1
Time 0.2 s 10 mm?s-1 0.2 s 20 mm?s-1
#: the correct aspect ratio for a flow versus volume display is
two units of flow
per one unit of volume.
TABLE 3 Summary of equipment quality control
Test Minimum interval Action
Volume Daily Calibration check with a 3-L syringe
Leak Daily 3 cmH2O (0.3 kPa) constant pressure
for 1 min
Volume linearity Quarterly 1-L increments with a calibrating
syringe
measured over entire volume range
Flow linearity Weekly Test at least three different flow ranges
Time Quarterly Mechanical recorder check with stopwatch
Software New versions Log installation date and perform test
using
‘‘known’’ subject
STANDARDISATION OF SPIROMETRY M.R. MILLER ET
AL.

322 VOLUME 26 NUMBER 2 EUROPEAN RESPIRATORY
JOURNAL
At least quarterly, volume spirometers must have their
calibration checked over their entire volume range using a
calibrated syringe [11] or an equivalent volume standard. The
measured volume should be within ¡3.5% of the reading or
65 mL, whichever is greater. This limit includes the 0.5%
accuracy limit for a 3-L syringe. The linearity check procedure
provided by the manufacturer can be used if it is equivalent to
one of the following procedures: 1) consecutive injections of
1-L volume increments while comparing observed volume
with the corresponding cumulative measured volume, e.g. 0–1,
1–2, 2–3,…6–7 and 7–8 L, for an 8-L spirometer; and 2)
injection of a 3-L volume starting at a minimal spirometer
volume, then repeating this with a 1-L increment in the start
position, e.g. 0–3, 1–4, 2–5, 3–6, 4–7 and 5–8 L, for an 8-L
spirometer.
The linearity check is considered acceptable if the spirometer
meets the volume accuracy requirements for all volumes
tested.
Quality control for flow-measuring devices
With regards to volume accuracy, calibration checks must be
undertaken at least daily, using a 3-L syringe discharged at
least three times to give a range of flows varying between 0.5
and 12 L?s-1 (with 3-L injection times of ,6 s and ,0.5 s). The
volume at each flow should meet the accuracy requirement of
¡3.5%. For devices using disposable flow sensors, a new
sensor from the supply used for patient tests should be tested
each day.
For linearity, a volume calibration check should be per-

formed weekly with a 3-L syringe to deliver three relatively
constant flows at a low flow, then three at a mid-range flow
and finally three at a high flow. The volumes achieved at each
of these flows should each meet the accuracy requirement of
¡3.5%.
Test procedure
There are three distinct phases to the FVC manoeuvre, as
follows: 1) maximal inspiration; 2) a ‘‘blast’’ of exhalation; and
3) continued complete exhalation to the end of test (EOT).
The technician should demonstrate the appropriate technique
and follow the procedure described in table 4. The subject
should inhale rapidly and completely from functional residual
capacity (FRC), the breathing tube should be inserted into the
subject’s mouth (if this has not already been done), making
sure the lips are sealed around the mouthpiece and that the
tongue does not occlude it, and then the FVC manoeuvre
should be begun with minimal hesitation. Reductions in PEF
and FEV1 have been shown when inspiration is slow and/or
there is a 4–6 s pause at total lung capacity (TLC) before
beginning exhalation [12]. It is, therefore, important that the
preceding inspiration is fast and any pause at full inspiration
be minimal (i.e. only for 1–2 s). The test assumes a full
inhalation before beginning the forced exhalation, and it is
imperative that the subject takes a complete inhalation before
beginning the manoeuvre. The subject should be prompted to
‘‘blast,’’ not just ‘‘blow,’’ the air from their lungs, and then he/
she should be encouraged to fully exhale. Throughout the
manoeuvre, enthusiastic coaching of the subject using appro-
priate body language and phrases, such as ‘‘keep going’’, is
required. It is particularly helpful to observe the subject with
occasional glances to check for distress, and to observe the
tracing or computer display during the test to help ensure
maximal effort. If the patient feels ‘‘dizzy’’, the manoeuvre

should be stopped, since syncope could follow due to
prolonged interruption of venous return to the thorax. This is
more likely to occur in older subjects and those with airflow
limitation. Performing a vital capacity (VC) manoeuvre (see VC
and IC manoeuvre section), instead of obtaining FVC, may help
to avoid syncope in some subjects. Reducing the effort part-
way through the manoeuvre [13] may give a higher expiratory
volume in some subjects, but then is no longer a maximally
forced expiration. Well-fitting false teeth should not be
routinely removed, since they preserve oropharyngeal geome-
try and spirometry results are generally better with them in
place [14].
With appropriate coaching, children as young as 5 yrs of
age are often able to perform acceptable spirometry [15].
The technicians who are involved in the pulmonary function
testing of children should be specifically trained to deal
with such a situation. A bright, pleasant atmosphere,
TABLE 4 Procedures for recording forced vital capacity
Check the spirometer calibration
Explain the test
Prepare the subject
Ask about smoking, recent illness, medication use, etc.
Measure weight and height without shoes
Wash hands
Instruct and demonstrate the test to the subject, to include
Correct posture with head slightly elevated

Inhale rapidly and completely
Position of the mouthpiece (open circuit)
Exhale with maximal force
Perform manoeuvre …
Reference Ranges for Spirometry Across All Ages
A New Approach
Sanja Stanojevic1,2, Angie Wade1, Janet Stocks2, John
Hankinson3, Allan L. Coates4, Huiqi Pan1, Mark Rosenthal5,
Mary Corey4, Patrick Lebecque6, and Tim J. Cole1
1Medical Research Council Centre of Epidemiology for Child
Health, and 2Portex Respiratory Unit, University College
London Institute of Child
Health, London, United Kingdom; 3Hankinson Consulting,
Valdosta, Georgia; 4Department of Respiratory Medicine,
Hospital for Sick Children,
Toronto, Canada; 5Royal Brompton Hospital, London, United
Kingdom; and 6Pediatric Pulmonology and Cystic Fibrosis Unit,
Cliniques St. Luc,
Université Catholique de Louvain, Brussels, Belgium
Rationale: The Third National Health and Nutrition
Examination
Survey (NHANES III) reference is currently recommended for
inter-
preting spirometry results, but it is limited by the lack of

subjects
younger than 8 years and does not continuously model
spirometry
across all ages.
Objectives: By collating pediatric data from other large-
population
surveys, we have investigated ways of developing reference
ranges
that more accurately describe the relationship between
spirometric
lung function and height and age within the pediatric age range,
and
allow a seamless transition to adulthood.
Methods: Data were obtained from four surveys and included
3,598
subjects aged 4–80 years. The original analyses were sex
specific and
limited to non-Hispanic white subjects. An extension of the
LMS
(lambda, mu, sigma) method, widely used to construct growth
ref-
erence charts, was applied.
Measurements and Main Results: The extended models have
four
important advantages over the original NHANES III analysis as
follows:
(1) they extend the reference data down to 4 years of age, (2)
they
incorporate the relationship between height and age in a way
that is
biologically plausible, (3) they provide smoothly changing
curves to
describe the transition between childhood and adulthood, and
(4)
they highlight the fact that the range of normal values is highly
de-

pendent on age.
Conclusions: The modeling technique provides an elegant
solution to
a complex and longstanding problem. Furthermore, it provides a
bio-
logically plausible and statistically robust means of developing
contin-
uous reference ranges from early childhood to old age. These
dynamic
models provide a platform from which future studies can be
devel-
oped to continue to improve the accuracy of reference data for
pulmo-
nary function tests.
Keywords: spirometry; pulmonary function; reference values
Reference data are important for interpreting pulmonary func-
tion test results and can aid in the management of respiratory
diseases (1). As measurement techniques, equipment, and
popula-
tion characteristics evolve, there is a concurrent need to keep
reference data up-to-date to reflect these changes. Currently in
the United States, the American Thoracic Society (ATS)
guidelines
recommend the use of the Third National Health and Nutrition
Examination Survey (NHANES III) reference as the standard
for interpreting spirometry results in the U.S. population (2, 3).
The NHANES III dataset is one of the few references spanning
childhood and adulthood that is also nationally representative
and generalizable. However, the NHANES III reference is lim-
ited by a lack of subjects younger than 8 years, which often
results
in reference data being extrapolated to younger ages. Subbarao
and coworkers (4) have demonstrated the inaccuracies in inter-

preting results at younger ages using NHANES III, and the ATS
strongly discourages extrapolation of reference data beyond the
intended age/height range (3, 4). The alternative, to use pediat-
ric reference equations before switching to NHANES III, intro-
duces discontinuities between equations at the transition point
and can lead to further misinterpretation.
In recognizing these limitations, it may be reasonable to col-
late other available pediatric data to extend the NHANES III
reference to younger ages. Collation of previously collected
refer-
ence data has been shown to be a reasonable alternative to col-
lecting new data provided the data are available for reanalysis
and
that the datasets are relatively homogeneous (5). Such an
extended
dataset could also be reanalyzed in an attempt to model the
transition between childhood and adulthood continuously, tak-
ing into consideration the combined effects of height and age.
Additional inconsistencies in pediatric reference data include
the selection of explanatory factors used to predict lung
function.
There is no doubt that spirometric lung function is related to
height and, in adults, both FEV1 and FVC are known to
decrease
with age. However, in children, as a result of the growth
process,
age and height are highly correlated, thus some references have
chosen to omit age from prediction models. The studies that ad-
justed for both height and age in children tend to describe either
an absolute association (additive) (2, 6, 7) or a proportional one
(multiplicative) (8) and, in some cases, develop age-specific
equa-
tions (6, 9). Adjustment for age using a proportional model may
be especially important during periods of rapid growth, such as

during puberty when lung and somatic growth may not be syn-
chronized (8, 10).
AT A GLANCE COMMENTARY
Scientific Knowledge on the Subject
Accurate interpretation of lung function tests relies on ref-
erence ranges which distinguish the effects of disease from
growth and development. Limited data from young children
limit the accuracy with which early lung disease may be
identified.
What This Study Adds to the Field
These extended models provide more accurate reference
ranges for spirometry with transition into adulthood and also
incorporate age-related differences in between-subject vari-
ability, improving the definition of lower limits of normal.
(Received in original form August 23, 2007; accepted in final
form November 14, 2007)
Supported by Asthma UK (S.S.) and UK Medical Research
Council grant
G9827821 (T.J.C.).
Correspondence and requests for reprints should be addressed to
Sanja Stanojevic,
M.Sc., Portex Respiratory Physiology Unit, UCL Institute of
Child Health, 30
Guilford Street, London, WC1N 1EH UK. E-mail:
[email protected]

Am J Respir Crit Care Med Vol 177. pp 253–260, 2008
Originally Published in Press as DOI: 10.1164/rccm.200708-
1248OC on November 15, 2007
Internet address: www.atsjournals.org
Spirometric lung function is further complicated because the
variability of individual measurements around the median is not
uniform across the age/height range and there is skewness in the
distributions. This means that conventional multiple regression
analysis is not adequate to model the complex relationship be-
tween body size and lung function.
This study investigated ways to develop more appropriate ref-
erence ranges that could describe the relationship between lung
function and height and age more accurately within the pediatric
age range, while also including the adult age range and the tran-
sition between childhood and adulthood. Preliminary results
were
presented in abstract form at the 2007 ATS conference (11).
METHODS
Outcome Measures
We have chosen to focus on three spirometry outcomes: FEV1,
FVC,
and forced expiratory flow, midexpiratory phase (FEF25–75),
plus the
FEV1/FVC ratio. FEF25–75 is referred to as MMEF in some
centers, but

will be referred to here as the FEF25–75. The outcomes were
modeled in
terms of sex, age, and height.
Data Collation
To supplement the data from the NHANES III survey, the lead
authors
of several large-population surveys that had measured children
across
a wide age range, including those younger than 8 years, were
contacted to
obtain original data. Positive responses were limited to those
from indi-
viduals personally known to the authors.
Data from four surveys were obtained, as summarized in Table
1.
Briefly, the NHANES III is a large, representative, stratified,
random
survey of the U.S. population aged 8 to 80 years (2). For these
analyses,
the original data were reanalyzed to be consistent with the 1994
ATS
criteria. These data were supplemented with pediatric reference
data
published by Rosenthal and colleagues, who sampled children
aged 4 to
19 from 12 London schools in the early 1990s (7). This
reference is cur-
rently recommended by the British Thoracic Society for
children in the
United Kingdom. Additional pediatric data were available from
a Belgian
study (12), which used spirometry to validate the forced
oscillation tech-

nique in children aged 5 to 18 years. Finally, data from an older
Canadian
study (13), which measured healthy individuals aged 4 to 40
years to de-
velop prediction equations, were also included in the analysis,
of which
only the pediatric subset younger than 20 years was used to
increase the
proportion of children within the collated dataset (13).
Two-thirds of the original NHANES III population was of
African-
American or Mexican-American ethnic origin and
approximately one-
third of the Rosenthal data were nonwhite. Rosenthal and
colleagues (7)
classified nonwhite subjects as Afro-Caribbean, Oriental,
Middle Eastern,
Pakistani/Bangladeshi, and other; of these, only 39 subjects
were Afro-
Caribbean. The other two surveys were limited to non-Hispanic
white
subjects. Due to known ethnic and racial differences in lung
function,
models developed for this study were limited to data from non-
Hispanic
white subjects, but were subsequently compared with data from
African-
American and Mexican-American subjects. The ethnic and
racial data
currently available for these analyses were not sufficient to
develop eth-
nically and racially specific ‘‘all age’’ reference ranges that
would be robust
enough to apply in multiethnic populations,

To identify possible transcription errors, each dataset was
examined
individually for obvious outliers and impossible values.
Because these
data have been published previously, the datasets were
relatively clean
and very few data points (,1%) needed to be excluded.
Statistical Analysis
Conventional multiple regression analysis relies on four
assumptions: (1)
a linear relationship, (2) constant variability of values around
the mean
across the range of height and age, (3) a normally distributed
outcome
variable, and (4) that the combined effect of the covariates is
additive. In
the case of spirometric measures of lung function, these
assumptions are
rarely met.
The LMS (lambda, mu, sigma) method (14), widely used to con-
struct growth reference charts, is an extension of regression
analysis that
includes three components: (1) the median (mu), which
represents how
the outcome variable changes with an explanatory variable (e.g.,
height
or age); (2) the coefficient of variation (sigma), which models
the spread
of values around the mean and adjusts for any nonuniform
dispersion;
and (3) the skewness (lambda), which models the departure of
the vari-

ables from normality using a Box-Cox transformation.
Because height is not the only explanatory variable that needs
to be
considered, we applied the LMS method using the GAMLSS
package
(15) in the statistical program R (Version 2.4.1; R Foundation,
http://www.
r-project.org) to allow for modeling of more than one
explanatory
variable—in this case, height, age, and potential between-center
differ-
ences. Separate models were developed for males and females.
Smoothly
changing curves were applied to remove the effects of sampling
and mea-
surement variability across the height and age range without
distorting the
underlying relationship, which allowed the effects of height and
age to be
TABLE 1. CHARACTERISTICS OF DATA INCLUDED IN
ANALYSIS, WHICH WERE RESTRICTED TO NON-
HISPANIC WHITE SUBJECTS
First Author
(Reference) Country No. in Study
Non-Hispanic
White Subjects
Included in
Analysis

Age
Range (yr) Population Selection Exclusion Criteria Equipment
Hankinson (2)
NHANES III
U.S. 20,627 2,273 8–80 Random sample of the
U.S. population
living in households
Symptomatic, history
of smoking, ,2 acceptable
maneuvers
Dry rolling seal
spirometer
Rosenthal (7) U.K. 1,007 761 4–19 12 London schools Acute
respiratory illness in
the 3 wk before testing,
or symptoms suggesting
chronic respiratory disease
12-L dry rolling

seal
spirometer
(OHIO 840)
Lebecque (12) Belgium 377 316 5–18 Children in private
schools from
upper-middle-class
families
Prior history of wheezing
or use of bronchodilator,
chronic cough, exercise
intolerance, frequent or
severe upper respiratory
tract infections, use of
tobacco products, major
health problems (particularly
cardiac or thoracic surgery),
asthma of parents or siblings,
or a history of respiratory

infection during the month
before the study
Automated 8-L
water
sealed spirometer
(Eagle 1)
Corey (13) Canada 337 248 5–19 Not specified Lung disease
Wedge spirometer
254 AMERICAN JOURNAL OF RESPIRATORY AND
CRITICAL CARE MEDICINE VOL 177 2008
modeled as a smooth transition from adolescence into adulthood
as
a function of age. The goodness of fit was assessed using the
Schwartz
Bayesian criterion (SBC), which compares consecutive models
directly
while adjusting for the increased complexity to determine the
simplest
model with best fit. Any reduction of SBC is considered
technically
important, but in practice we chose to balance reductions with
clinical
relevance and biological plausibility. A detailed description of
the sta-

tistical methodology is planned.
The fitted model provides height/age/sex-specific values of the
three
elements of the distribution (median, coefficient of variation
(CV), and
skewness). The spirometry standard deviation (SD) is in the
same units
as the outcome (i.e., L or L/s), whereas the CV is defined as 100
3 (SD/
median). The median is the predicted value for the individual,
which,
together with the CV and skewness, allows the individual’s
measurement
to be converted to a z score; z scores are normally distributed
with a
mean of 0 and an SD of 1. The lower limit of normal is
officially defined
as the fifth percentile of the distribution that corresponds to a z
score of
21.64 (1, 3). We also discuss the clinical definition that
assumes a CV
of 10% and defines the normal range as two CVs on either side
of 100%
predicted (i.e., a normal range of 80 to 120%).
RESULTS
The models were based on 3,598 non-Hispanic white subjects
aged 4 to 80 years, 2,182 (60.5%) of whom were younger than
20
years and 271 (7.5%) of whom were younger than 8 years. A
description of the demographic characteristics of the study
popu-
lation can be found in Table 2. Children from the NHANES III
reference were taller (height-for-age z score) (16) and heavier

(weight-for-age z score) (data not shown) than children in the
other three datasets. On the basis of the original distributions of
each of the three outcomes (FEV1, FVC, and FEF25–75)
studied,
height and age were nonlinear, the spread of values around the
mean was nonuniform for both height and age, and there was
also evidence that the distributions of all three outcomes were
skewed. FEF25–75 results were not available for the British
data,
so models for these data are based on 700 fewer subjects.
Median Trends
The models for all three outcomes were dependent on height and
age, and logarithmic transformation of both the outcome and
explanatory variables was necessary. The resulting models de-
scribe a multiplicative and allometric height relationship, where
all three spirometric outcomes are proportional to height raised
to the power 2.5. For example, a 1% increase in height corre-
sponds to a 2.5% increase in spirometry. The median volumes
for
each of the outcomes, smoothed by age, are presented in Figure
1.
Despite age and height being highly correlated, there was a sig-
nificant and independent effect of age after adjusting for height
(Figure 2).
Variability
The LMS method also quantifies the spread of values around the
median, which is essential information when determining the
range of expected lung function values in a normal population.
After adjustment for the effects of height and age, the between-
subject variability, characterized by the CV, demonstrated
impor-
tant age-related trends (Figure 3). The between-subject

variability
was highly age dependent, being greatest in children younger
than
11 years and increasing steadily with increasing age in adults
after
the age of 30. The variability of FEF25–75 was noticeably
larger
than for FEV1 and FVC. The commonly quoted ‘‘normal range’’
of 80 to 120% predicted assumes a CV of 10%; however, as can
be
seen from Figure 3, even for FVC, this only occurs over a
limited
age range of 15 to 35 years. By contrast, at 5 to 6 years of age,
the
CV for FEV1 and FVC is 15%, corresponding to a normal range
of 70 to 130% predicted. The CV for FEF25–75 at age 5 to 6
years
is 20%, corresponding to 60 to 140% predicted, and by age 50,
the CV for FEF25–75 has widened to 30%, a normal range of 40
to
160%.
Skewness
After adjustment for height and age, there was little evidence of
skewness for FEV1 and FVC. By contrast, there was significant
skewness in FEF25–75 and the FEV1/FVC ratio for both sexes,
which was incorporated into the prediction models.
The age-related changes in FEV1 and FVC were accompanied
by age-related changes in the ratio (FEV1/FVC) (Figure 4). As
can be seen, the frequently quoted predicted FEV1/FVC of 0.7
is
not in fact attained until around 50 years of age in males and
considerably later in females, being noticeably higher during
childhood and lower in the elderly. The range of ‘‘normal

values’’
for this ratio is age dependent, being wider in both the young
and
the elderly, and sex differences are apparent, with females
having
greater predicted values of FEV1/FVC than males at all ages
and
which are most marked in late puberty (Figure 4).
Between-Center Differences
The models were further explored by evaluating the extent to
which between-center differences affected the expected refer-
ence range. After adjustment for height and age, there were
small
but significant between-center differences in FEV1 for both
males
and females and in FVC for females. Compared with NHANES
III, median values from Lebecque (12) and Corey (13) were 2 to
3% greater after adjustment, whereas those from Rosenthal and
colleagues (7) were approximately 4% smaller. Interestingly, no
between-center differences were observed for FVC in males or
for FEF25–75 in either sex.
TABLE 2. SUMMARY OF DATA INCLUDED IN ANALYSIS
No. Male % (n) Female % (n)
Age (yr) Height (cm) Height-for-Age z Score*
Mean (SD) Range Mean (SD) Range Mean (SD) Range
Total 3,598 45 (1,621) 55 (1,977) 16† 4.0 to 80 156.7 (17.6)
107.4 to 206.5 0.33 (1.0) –4.1 to 4.1
NHANES (2)

Adult 1,552 35.2 (546) 64.8 (1,006) 45.0 (19.1) 17 to 80 167.4
(9.8) 144.7 to 206.5 NA NA
Pediatric 721 48.6 (350) 51.5 (371) 11.9 (2.5) 8.0 to 16.9 150.2
(15.1) 117.0 to 191.9 0.52 (1.1) 24.1 to 4.1
Rosenthal (7) 761 59.0 (453) 41.0 (315) 11.7 (3.7) 4.4 to 18.8
148.0 (20.0) 107.4 to 191.6 0.18 (0.9) 23.1 to 3.8
Lebecque (12) 316 52.5 (166) 47.5 (150) 12.0 (3.5) 5.0 to 18.9
149.0 (18.1) 112.0 to 186.0 0.21 (0.9) 22.8 to 3.1
Corey (13) 248 44.4 (110) 55.7 (138) 11.1 (3.5) 4.0 to 18.0
144.7 (17.8) 109.0 to 183.0 0.26 (1.1) 22.4 to 3.2
Definition of abbreviations: NA 5 not applicable; NHANES 5
National Health and Nutrition Examination Survey.
* z scores calculated using the British 1990 growth reference
charts (16).
† Median age: Age was not normally distributed, therefore
median is presented instead of mean.
Stanojevic, Wade, Stocks, et al.: Pediatric Reference Ranges for
Spirometry 255
Ethnic Differences
The NHANES III African-American subjects had considerably
lower FEV1 and FVC, but similar flows and FEV1/FVC
compared

with non-Hispanic white subjects (Figure 5). With the exception
of FVC in females, Mexican Americans had similar values to
non-
Hispanic whites. Ethnic and racial differences varied according
to
sex, generally being more marked in females. Of significance is
that the standard deviations for each of the sex-specific ethnic
z scores were approximately 1, which could facilitate develop-
ment of race- and sex-specific adjustment factors to account for
the shift in values.
Comparison with the Original NHANES III Equations
Figure 6 compares the current model with the original NHANES
III equations in terms of the median and the lower limit of
normal.
Although the new model is not dramatically different from the
original, three major advantages of the current approach can be
seen. First, the current models extend the reference down to 4
years of age, thereby improving the accuracy with which normal
values can be predicted in very young children; it can be seen
that
the original NHANES III equations underpredict lung function
in healthy children younger than 10 years and therefore fail to
identify early lung disease. Second, smoothly changing curves
describe the transition between childhood and early adulthood.
Third, the age-dependent between-subject variability is quanti-
fied, thereby allowing improved precision with which to define
the
lower limits of normal at all ages.
Reference Equations
The methods used do not produce equations per se but compre-
hensive look-up tables that can be applied in a Microsoft Excel
add-in module. The module can be found at www.growinglungs.

org.uk (Pediatric Reference Ranges for Spirometry). The
module
can also be easily implemented into current commercial spiro-
meters, upon request by manufacturers. The program facilitates
prospective interpretation of a single observation or
retrospective
analysis of an entire dataset to calculate z scores, % predicted,
or
centiles.
DISCUSSION
This study presents a new approach to modeling spirometry
data, which produces ‘‘all age’’ reference curves using a single,
smoothly age-changing model to explain the complex relation-
ship between lung function and height and age during puberty
and
early adulthood. In addition, the dataset extends the NHANES
III reference to include children as young as 4 years and uses
age-
dependent between-subject variability to establish the lower
limits of normal. This study also confirms previous observations
regarding the rapid decrease in the FEV1/FVC ratio with age
(17,
18). In Figure 3, we demonstrate that the ratio is not fixed at
0.70,
as recommended by the Global Initiative for Chronic
Obstructive
Lung Disease (19, 20), but is markedly age dependent. The
initial
high values reflect the relatively large airways in relation to
lung
volumes in early life, which are associated with a short
expiratory
time constant and rapid lung emptying, whereas during adoles-

cence, the rapid decline in FEV1/FVC probably reflects the
differ-
ent rates of lung and airway growth (dysanaptic growth) during
this period, which may be particularly marked in males, in
whom
lung growth continues for several years after somatic growth
has
ceased (8, 17, 18).
A key feature of this study is the proportional model that
adjusts for measures of body size and age in a way that is
biologi-
cally plausible, where the nonlinear height relationship
approximates
Figure 1. The median volumes for each out-
come, smoothed by age. Note the relatively
higher FEF25–75 compared with FVC and FEV1
in females compared with males.
256 AMERICAN JOURNAL OF RESPIRATORY AND
CRITICAL CARE MEDICINE VOL 177 2008
the three-dimensional shape of the chest. Inclusion of an age
adjustment in addition to height allows the complex changes
during puberty to be accounted for without the need to
undertake
pubertal staging, which may be impractical in many clinical and
research settings.
The ethnic/race trends in Figure 5 complement those described

in the original analysis of the NHANES III dataset and high-
light the fact that ethnic/race adjustments are complex and in-
consistent, such that using the same adjustment factor for both
sexes and for all outcomes, as commonly reported (i.e., 12%), is
unlikely to be appropriate (3). For these analyses, we did not
have access to sufficient ethnicity- and race-specific data in
children younger than 8 years or for other groups not included
in the NHANES III dataset. Having now established suitable
modeling techniques to allow development of all-age reference
ranges, a further initiative will be required to collate more
ethnicity- and race-specific spirometric data from healthy
children (especially those younger than 8 yr) and adults, so
that the exercise can be extended for multiethnic application.
Implications for Practice
In addition to allowing more accurate predictions of expected
val-
ues in younger children and a smooth transition between
pediatric
and adult reference data, the ability to quantify the age/height-
adjusted between-subject variability has major implications for
defining clinical thresholds of normal. Respiratory clinicians
are
familiar with expressing lung function results as % predicted
(i.e.,
100 *[observed/predicted]), where the predicted value comes
from a reference equation incorporating sex, age, and height. A
value of 100% predicted represents the median reference value,
with a range of values around the median indicating between-
subject variability. For FEV1 and FVC, this variability has con-
ventionally been taken to be a CV of 10% and the normal range
is
62 CVs of the median (i.e., 80–120%). This is valid as long as
the
CV is genuinely 10%. However, when actual variability of the

three spirometric outcomes is plotted as a function of age
(Figure
4), it can be seen that a CV of 10% is only observed over a
narrow
age range of between 15 and 35 years. In younger children and
older adults, the CV approaches 15% for FEV1, which extends
the normal range to 70 to 130%. Given this wider range of
normal
values in younger and older subjects, age-specific cutoffs for
the
lower limit of normal are essential because failure to account
for
this increased variability will incorrectly flag individuals as
‘‘abnor-
mal.’’ This problem is exacerbated by the differences in
between-
subject variability between different spirometric outcomes.
An alternative approach is to express results in terms of a
z score rather than % predicted, because z scores combine the
% predicted and CV into a single number: z score 5 (% pre-
dicted – 100)/CV. Regardless of the CV, the range of normal
values is consistent as the z score changes in relation to the CV.
Thus, although 80% predicted represents the lower limit of nor-
mal when the CV is 10%, it is well within the normal range if
the
CV is greater than this. This further emphasizes the need to
know
the between-subject CV for each outcome if results that have
been
expressed as % predicted are to be interpreted correctly.
Although the ATS statement on interpretation of lung func-
tion tests does not explicitly recommend the use of z scores, it
does state that the lower limit of normal corresponds to the fifth

percentile of the frequency distribution (3). When data are nor-
mally distributed, z scores correspond directly to percentiles
such that a z score of 21.64 is equivalent to the fifth percentile
(1). Thus, the lower limit of normal can be defined as % pre-
dicted 21.64 3 CV. Regardless of whether z scores, percentiles,
or % predicted are used to interpret results, it is imperative to
consider the between-subject CV when determining the lower
limit of normal.
Strengths and Limitations
We have demonstrated that it is possible to collate data from
more than one center, and have established a foundation on
which
Figure 2. Height trends in FEV1 at eight specific ages dem-
onstrate that, for any given height, age is as important to
consider in determining the reference range, especially dur-
ing puberty. In contrast to adulthood, where there is a de-
cline with age, throughout childhood at any given height an
older subject can be expected to have higher values of lung
function. This effect is most marked during puberty.
Stanojevic, Wade, Stocks, et al.: Pediatric Reference Ranges for
Spirometry 257
larger international and more comprehensive datasets can be
built. The small differences observed between centers could be
due to equipment or software differences, measurement

technique,
and/or true population differences. For instance, the Canadian
data are more than 30 years old and there may be differences in
population characteristics, such as timing of puberty. It is re-
markable that, despite the cohort effects, the differences
between
centers were minimal and not likely to be clinically important.
Ideally, each center should develop its own reference ranges;
however, in practice, this is rarely feasible (22, 23). In effect,
this
combined dataset describes a typical center, trading off a slight
reduction in precision, due to the increased between-center vari-
ability, against a reduction in bias. Combining data from more
than one center provides a possible way forward to address the
ongoing practical problem of applying reference data in centers
that lack their own reference. Nevertheless, centers should con-
tinue to validate reference equations with a sample of healthy
con-
trol subjects from their own population to test for any
systematic
Figure 3. Between-subject variability, ex-
pressed as the coefficient of variantion (CV)
for each of the three spirometric outcomes.
A CV of 10% corresponds to a normal
range of 80 to 120% predicted. As can be
seen, the CV for FVC and FEV1 is near …
Disclaimer: This is a machine generated PDF of selected
content from our databases. This functionality is provided

solely for your
convenience and is in no way intended to replace original
scanned PDF. Neither Cengage Learning nor its licensors make
any
representations or warranties with respect to the machine
generated PDF. The PDF is automatically generated "AS IS"
and "AS
AVAILABLE" and are not retained in our systems. CENGAGE
LEARNING AND ITS LICENSORS SPECIFICALLY
DISCLAIM ANY
AND ALL EXPRESS OR IMPLIED WARRANTIES,
INCLUDING WITHOUT LIMITATION, ANY WARRANTIES
FOR AVAILABILITY,
ACCURACY, TIMELINESS, COMPLETENESS, NON-
INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A
PARTICULAR
PURPOSE. Your use of the machine generated PDF is subject to
all use restrictions contained in The Cengage Learning
Subscription
and License Agreement and/or the Gale Academic OneFile
Terms and Conditions and by using the machine generated PDF
functionality you agree to forgo any and all claims against
Cengage Learning or its licensors for your use of the machine
generated
PDF functionality and any output derived therefrom.
Spirometry in Primary Care Practice(*)
Authors: Tam Eaton, Steve Withy, Jeffrey E. Garrett, Jill
Mercer, Robert M. L. Whitlock and Harry H. Rea
Date: Aug. 1999
From: Chest(Vol. 116, Issue 2)
Publisher: Elsevier B.V.
Document Type: Article

Length: 4,638 words
The Importance of Quality Assurance and the Impact of
Spirometry Workshops
Objective: To determine the quality of spirometry performed in
primary care practice and to assess the impact of formal
training.
Design: Randomized, controlled prospective interventional
study.
Setting: Primary care practice, Auckland City, New Zealand.
Participants: Thirty randomly selected primary care practices
randomized to "trained" or "usual" groups. One doctor and one
practice nurse were nominated to participate from each practice.
Interventions: "Trained" was defined as participation in an
"initial" spirometry workshop at week 0 and a "maintenance of
standards" workshop at week 12. "Usual" was defined as no
formal training until week 12, when participants they attended
the
same "initial" workshop provided for the trained group. The
study duration was 16 weeks. Each practice was provided with a
spirometer to be used at their clinical discretion.
Measurements and results: Spirometry data were uploaded
weekly and analyzed using American Thoracic Society (ATS)
criteria for acceptability and reproducibility. The workshops
were assessed objectively with practical and written
assessments,
confirming a significant training effect. However, analysis of
spirometry performed in clinical practice by the trained
practitioners
revealed three acceptable blows in only 18.9% of patient tests.

In comparison, 5.1% of patient tests performed by the usual
practitioners had three acceptable blows (p [is less than]
0.0001). Only 13.5% of patient tests in the trained group and
3.4% in
the usual group (p [is less than] 0.0001) satisfied full
acceptability and reproducibility criteria. However, 33.1% and
12.5% of
patient tests in the trained and usual groups, respectively (p [is
less than] 0.0001), achieved at least two acceptable blows, the
minimum requirement. Nonacceptability was largely ascribable
to failure to satisfy end-of-test criteria; a blow of at least 6 s.
Visual inspection of the results of these blows as registered on
the spirometer for the presence of a plateau on the volume-time
curve suggests that [is less than] 15% were acceptable.
Conclusions: Although a significant training effect was
demonstrated, the quality of the spirometry performed in
clinical practice
did not generally satisfy full ATS criteria for acceptability and
reproducibility. Further study would be required to determine
the
clinical impact. However, the ATS guidelines allow for the use
of data from unacceptable or nonreproducible maneuvers at the
discretion of the interpreter. Since most of the failures were
end-of-test related, the [FEV.sub.1] levels are likely to be valid.
Our
results serve to emphasize the importance of effective training
and quality assurance programs to the provision of successful
spirometry in primary care practice. (CHEST 1999; 116:416-
423)
Key words: primary care practice; quality assurance;
randomized controlled; spirometry; spirometry workshops

Abbreviations: ATS = American Thoracic Society; PEF = peak
expiratory flow
Spirometry is pivotal to the screening, diagnosis, and
monitoring of respiratory disease and is increasingly advocated
in primary
care practice. Earlier this year, CHEST published a
comprehensive supplement entitled "Strategies in Preserving
Lung Health
and Preventing COPD and Associated Diseases."[1] This major
new initiative is known as the National Lung Health Education
Program and is directed at primary care physicians. The
campaign is clearly underpinned by spirometry. COPD is a
leading
cause of both morbidity and mortality, not only in the United
States but also in most developed countries, including New
Zealand.[2,3] However, airflow obstruction is also a marker of
increased risk of death from heart disease, lung cancer, and
stroke.[4-6] A recent consensus statement of the European
Respiratory Society emphasized the importance of spirometry in
allowing the early diagnosis of COPD in asymptomatic
patients.[7] Smoking cessation strategies then may be more
appropriately directed, potentially yielding an immense public
health gain.
Although spirometry is often described as a simple screening
test, due consideration is essential not only of equipment
selection, but, importantly, of test performance and correct
interpretation of the results. In 1991, the American Thoracic
Society
(ATS) Statement on Lung Function Testing stated: "The largest
single source of within subject variability is improper
performance of the test."[8] This was persuasively addressed in
the Lung Health study, particularly with regard to the
importance of ongoing maintenance of standards.[9,10] Hence,

effective training and quality assurance are vital prerequisites
for successful spirometry.[11] While well-established criteria
for acceptability and reproducibility have been widely
disseminated, it is by no means certain that these are adhered to
in clinical practice. Excepting research studies and accredited
pulmonary function laboratories,[12] there are no formal quality
assurance programs in place. Quality assurance is crucial to
prevent misleading results and misdiagnoses. If spirometry is to
be promoted as a screening tool in primary care practice, it is
important that careful attention is paid to ensuring that quality
standards are met.
No previous study has formally assessed spirometry
performance in primary care practice. The development of
"smart"
spirometers has enabled the quality of spirometry performed in
clinical practice to be objectively assessed using the ATS
criteria for acceptability and reproducibility.[11] We aimed to
determine both the quality and the impact of training on
spirometry
performed in primary care practice.
MATERIALS AND METHODS
Participants
A randomized, controlled prospective study was performed to
examine the quality of spirometry in primary care practice.
Practitioners were invited to participate by an introductory
letter, sent to 301 primary care practices; 119 of 301 practices
(40%)
accepted. Thirty of 119 practices (25%), each nominating one
doctor and one nurse, were randomly selected to make up the
final study group. Local ethics committee approval was
obtained.

Three separate evaluations were performed:
1. Spirometry quality (using ATS criteria) was examined in all
patient tests (n = 1,012);
2. Practical and written assessments were used to quantify the
training effect of the spirometry workshops; and
3. Indications for (n = 580) and interpretations of (n = 559)
spirometry were inspected using a randomly selected subgroup.
Study Design
As shown in Figure 1, practices were randomly assigned to
spirometry training ("trained" group, n = 15) or to the
performance
of spirometry without prior instruction ("usual" group, n = 15).
In the trained group, the doctors and practice nurses attended an
"initial" spirometry workshop at week 0 and a further
"maintenance of standards" workshop at week 12.
In the usual group, the spirometer was delivered to the doctor
and nurse with instructions for its operation, but no training in
spirometry performance was given. At week 12, they attended
the same spirometry workshop provided in week 0 for the
trained group.
Spirometry Workshops
The initial workshop was 2 h in duration and comprised
theoretical and practical aspects of spirometry performance,
with
particular attention paid to acceptability and reproducibility

criteria and the importance of quality assurance (see
"Appendix").
The spirometry workshops were led by the clinical director of
our pulmonary function laboratory, two consultant
pulmonologists,
a clinical respiratory scientific officer, and charge respiratory
technician. All had [is greater than] 15 years of experience in
the
field. This workshop was held at week 0 for the trained group
and week 12 for the usual group. The trained group received a
90-rain "maintenance of standards" workshop at week 12.
Quality assurance was emphasized.
Following the workshops, the results of the written and
practical assessments and feedback on the spirometry they had
performed in the previous 12 weeks (provisionally analyzed for
acceptability and reproducibility) were discussed individually
with each practitioner.
Spirometers
Each practice was provided with a handheld spirometer (2120
U; Vitalograph Limited; Buckingham, UK) certified by Dr. R.O.
Crapo, MD (Medical Director of the Pulmonary Laboratory at
LDS Hospital, Salt Lake City, UT) as meeting ATS
standards.[11]
This unit uses a pneumotachograph flow sensor. A perceived
advantage of the unit was the provision of "built-in" quality
assurance features, based on ATS criteria.[11] Prompts were
displayed after each blow for an unacceptable blow (eg, slow
start, or abrupt end), for recommending the performance of
three or more blows, and giving the variability between the two
largest values for [FEV.sub.1] and for FVC.
Data Collection

During the study period, only the nominated doctor and/or
practice nurse from each practice used the spirometer. Both
groups
were provided with the clinical indications for spirometry.[13]
They were advised to use the spirometer entirely at their
clinical
discretion. The spirometer had a 100-test memory, allowing data
to be uploaded at weekly intervals; in this way any study
effect on the generation of spirometry by the participants was
minimized. A unique patient identification code was entered.
Standard demographic data (age, sex, height, and ethnicity)
were entered with the derivation of normal predicted
values.[14,15] For study purposes, only expiratory parameters
were measured: [FEV.sub.1], FVC, peak expiratory flow rate,
and forced expiratory flow (midexpiratory phase).
Primary Outcome Assessments
Spirometry Quality Assurance: All the spirograms generated by
the practitioners during the study period were analyzed for
acceptability and reproducibility as specified by the ATS
quality criteria.[11]
Acceptability Criteria: Individual spirograms were judged
"acceptable" if all of the following were satisfied: good start (as
defined by an extrapolated volume [is less than] 5% of FVC or
150 mL, whichever is greater); satisfactory exhalation for [is
greater than or equal to] 6 s; free from abrupt end; and free
from cough.
Reproducibility Criteria: After three acceptable spirograms
were identified, the following tests were applied. Were the two
largest FVC values within 200 mL of each other? Were the two
largest [FEV.sub.1] values within 200 mL of each other? If both
these criteria were satisfied, the test was judged reproducible.

"Good start" has also been defined by time-to-peak expiratory
flow (PEF). Time-to-PEF is not currently a standard
recommendation in the ATS criteria[11] and, hence, was not
used in the overall analysis of acceptability. However, the
spirometry unit used in this study incorporated quality prompts
with poor start (defined as a time-to PEF [is greater than] 85
ms), as did the Lung Health study.[9]
Since blows of [is less than] 6 s may be acceptable if a plateau
is reached, a random selection of these blows was scrutinized
by two experienced pulmonologists. All blows [is less than] 4 s
were automatically judged nonacceptable. A plateau was
defined as no visible change in volume for at least 1 s.
Practical and Written Spirometry Assessments: Both groups
were formally assessed on practical performance of spirometry
at
week 19. (see "Appendix"). Each practitioner was asked to
perform spirometry on a "naive subject" and was scored on a
scale
of 1 to 10 by one of five trained examiners. Although the
examiners were not blinded, scoring bias was minimized by
using
stringent objective criteria. A score of 8 was judged
"acceptable." The trained group completed a written assessment
before
and immediately after the week 0 workshop. The same
assessment was repeated before the week 12 workshops in both
groups. The assessment included only material presented at the
workshop. We developed a video of "poorly performed"
spirometry containing five errors to be identified. The written
and practical assessments were fed back to the trained group at
week 12 (ie, the formative assessment).

Indications for and Interpretation of Spirometry: At the end of
the study, each practice was provided with 25 patient
identifications, randomly selected from their patients who had
performed spirometry in the preceding 16 weeks. Patient
demographic data were recorded, and the doctor was asked to
specify the indication for spirometry (eg, screening of an
asymptomatic smoker). For each spirometric record, the doctor
was asked to provide an interpretation, choosing from the
following possibilities: normal, early small airways disease,
obstruction, restriction, mixed obstruction/restriction, inability
to
interpret due to inadequate spirometry, and other. These records
were then reviewed by two experienced pulmonologists. The
same information was provided as was available to the primary
care physician; patient demographic data, "indication," forced
expiratory indexes with normal predicted values, the expiratory
curve, and the acceptability and reproducibility prompts.
Primary care physician interpretations were then marked as
"correct" or "incorrect."
Data were normally distributed and were presented as mean
(SD). Proportions were compared by Fisher's Exact Test.
Changes in scores were analyzed using paired Student's t test.
Logistic regression was used to explore predictors of
"acceptable" spirometry. A p value [is less than] 0.05 was
considered significant. All analyses were performed on a
personal
computer (IBM-compatible) using appropriate software (SAS,
version 6.1 for Windows; SAS Institute; Cary, NC).
RESULTS
A total of 1,012 patient tests (2,928 blows) were performed. The

mean number (SD) of patient tests per practice per week was
2.3 (2.4). Patient demographic data included the following:
mean age, 46.0 years (range, 5 to 90 years); and patients were
equally divided by gender (male/female ratio, 1:0.98). The
majority of patients were either white (83%) or Maori/Pacific
Islander
(12.5%), reflecting the ethnic distribution in the Auckland
region.
Spirometry Quality Assurance
Spirometry quality for weeks 0 to 11 is shown in Table 1.
Patient tests satisfied ATS criteria for acceptability and
reproducibility
more frequently in the trained group than in the usual group (p
[is less than] 0.0001). Nonacceptability was largely ascribable
to
failure to satisfy end-of-test criteria (Table 2). From a total of
2,928 blows, 825 (28%) were [is greater than] 6 s, and 1,380
(47%) were [is less than] 4 s and were rejected automatically.
The remaining 723 blows (25%) of 4 to [is less than] 6 s
duration
may have been acceptable on further scrutiny. Visual inspection
of 245 blows (33%) demonstrated a plateau in 90 (37%). This
finding suggests that [is less than] 15% of blows that were
judged unacceptable on the grounds of inadequate duration were
acceptable. Individual blows were more often unacceptable in
the usual group (p [is less than] 0.0001); however, following the
week 12 workshop, a clear training effect was observed (p [is
less than] 0.0001) (Fig 2). The proportion of acceptable blows
was then consistent with the trained group. Lung function and
spirometry quality assurance data were tabulated by age (Table
3). On logistic regression, the major determinant of an
acceptable maneuver was "training" (p = 0.0001). Blows were
more
often unacceptable at the extremes of age ([is less than] 10

years and [is greater than] 80 years; p = 0.01), in women (p =
0.05), and in non-Europeans (p = 0.006).
Data Trained Usual
No. of patient tests 275 471
No. of blows 923 1,285
Patient tests
3 blows 242 (88.0) 369 (78.3)
[is greater than or
equal to] 3 acceptable blows 52 (18.9) 24 (5.1)
[is greater than or
equal to] reproducible blows 37 (13.5) 16 (3.4)
[is greater than or
equal to] 2 acceptable
blows([dagger]) 91 (33.1) 59 (19.5)
Data p Value
No. of patient tests
No. of blows
Patient tests
3 blows 0.0008
[is greater than or
equal to] 3 acceptable blows <0.0001
[is greater than or
equal to] reproducible blows <0.0001
[is greater than or
equal to] 2 acceptable
blows([dagger]) <0.0001
Data Trained Usual
No. of blows 923 1,285

Blows with poor start([dagger]) 32 (3) 32 (2)
Blows with unsatisfactory exhalation
Duration < 6 s 565 (61) 1,078 (84)
Abrupt end 263 (28) 674 (52)
Blows with cough 16 (2) 26 (2)
Blows with poor start([dagger]) 296 (32) 461 (36)
Data p Value
No. of blows
Blows with poor start([dagger]) NS
Blows with unsatisfactory exhalation
Duration < 6 s <0.0001
Abrupt end <0.001
Blows with cough NS
Blows with poor start([dagger]) NS
Table 1--Spirometry Quality Assurance Data From Weeks 0 to
11 Comparing Trained With Usual Practitioners(*)
(*) Values given as No. (%), unless otherwise indicated.
([dagger]) The ATS statement specifically states that "the only
criterion for unacceptable subject performance is fewer than
two
acceptable curves."(11)
Table 2--Spirometry Quality Assurance Data From Weeks 0 to
11 and Reasons for Lack of Acceptability(*)
No. of
Age, yr Patients [FEV.sub.1], %pred [FVC.sub.1],
%pred

< 10 50 85.8 (16.4) 87.9 (16.8)
10-19 163 87.7 (19.4) 94.3 (20.8)
20-29 240 97.7 (21.5) 97.7 (22.3)
30-39 344 99.2 (22.2) 98.9 (21.6)
40-49 348 97.3 (23.3) 96.4 (24.3)
50-59 291 95.7 (23.2) 93.8 (23.1)
60-69 315 83.4 (21.4) 79.1 (18.2)
70-79 197 85.2 (21.8) 83.7 (20.6)
80-89 35 83.1 (22.5) 82.7 (25.0)
Patient Tests With
[is greater than
or equal to] 3
Acceptable
Age, yr [FEV.sub.1]/FVC, %pred Blows, No. (%)([dagger])
< 10 102.3 (10.2) 0 (0)
10-19 95.5 (13.2) 0 (0)
20-29 101.6 (11.8) 2 (2.1)
30-39 101.4 (13.9) 11 (8.6)
(*) Values given as No. (%) unless otherwise indicated. NS =
not significant.
([dagger]) Poor start defined as the extrapolated volume > 5%
FVC or 150 mL, whichever amount is greater.
([double dagger]) Poor start defined as time-to PEF > 85 ms, as
in the Lung Health study?
Table 3--Lung Function and Spirometry Quality Assurance Data
Characterized by Age(*)
40-49 99.9 (14.8) 11 (9.0)

50-59 99.3 (17.1) 15 (12.9)
60-69 101.0 (20.5) 22 (20.0)
70-79 98.7 (32.2) 14 (18.2)
80-89 95.5 (20.3) 1 (5.9)
(*) Values given as mean [+ or -] SD. %pred = percent
predicted.
([dagger]) Weeks 0 to 11.
Practical and Written Spirometry Assessments
The practical spirometry assessment performed at week 12
demonstrated a higher proportion of practitioners performing
"acceptable" spirometry in the trained group (16/24 [67%]) than
in the usual group (4/25 [16%]) (p = 0.0004) (Fig 3). Written
assessments were performed by the trained group before and
after their spirometry workshop at week 0, with mean
preworkshop scores of 16 (range, 14.3 to 17.7) for doctors and
6.2 (range, 4.5 to 7.8) for nurses. Immediately after the
workshop, scores improved to a mean scores of 26.3 (range,
23.7 to 28.8) and 17.6 (range, 15.3 to 20), respectively (p [is
less
than] 0.0001) (Fig 4). At week 12, immediately before the
maintenance of standards workshop, the scores had fallen for
doctors to a mean of 23.5 (range, 21.7 to 25.3; p = 0.02) but had
not for nurses (mean, 17.1; range, 13.6 to 20.6).
Indications for and Interpretation of Spirometry
A random selection of 580 of 1,012 patient tests (57%) were
further analyzed. Patient demographics did not differ
significantly
in this subgroup. The indications given for performing
spirometry were the following: management of asthma (41%);
investigation of respiratory symptoms (22%); COPD (14%);

screening of asymptomatic smokers (8%); and
insurance/medical
exams (3%). Practitioners believed that results of spirometry
testing helped in counseling smokers in 13% of patient tests. A
total of 559 patient tests were assessed by two experienced
pulmonologists. The primary care physician's interpretation was
judged correct in 296 of 559 patient tests (53%). The proportion
of correct interpretations did not differ significantly between
the
trained and usual groups of primary care physicians.
DISCUSSION
This is the first study to formally address the quality of
spirometry performed in primary care practice. Spirometry
performed in
clinical primary care practice did not generally satisfy ATS
criteria for acceptability and reproducibility, both before and
after
formal training, although a significant training effect was
observed. Is it possible the ATS criteria are unnecessarily
rigorous?
Published data would suggest not. The Lung Health Study
demonstrated that in only 2.1% of test sessions were
participants
unable to produce three acceptable FVC maneuvers, with the
two best [FEV.sub.1] values matching within 5% or 100 mL.[9]
This was more stringent than the ATS criteria of 200 mL that
we used.[11] However, a direct comparison with the Lung
Health
study may not be entirely appropriate due to differences in the
study population; patients with COPD may find it easier to
comply, especially with end-of-test criteria. The primary aim of
our study was to address the quality of spirometry performed in
clinical practice. This necessarily required the use of well-
defined standard objective criteria. We believe that these

criteria
served the purpose of assessing the effect of training, but this
study was not designed to assess the impact of poorly
performed spirometry on clinical practice.
There is a danger of undue negativity based on an uncritical
"interpretation" of our data. The ATS statement allows for the
use
of data from unacceptable or nonreproducible maneuvers at the
discretion of the interpreter. We appreciate that a blow of [is
less than] 6 s may be acceptable if a plateau has been achieved.
Further analysis suggested that this was only achieved in a
small proportion of blows. It is likely that primary care
practitioners, practice nurses, and, indeed, the patients have
been well
schooled in the use of a peak flow meter for asthma. Since
asthma was given as the indication for testing in 40% of eases,
this
may be a possible explanation for the relatively "good starts"
but unacceptably "short" blows. However, since the majority of
test failures were end-of-test related, the [FEV.sub.1] level is
likely to be valid and could be used for serial monitoring.
Failure
to meet the ATS criteria does not necessarily imply that the test
performance was clinically invalid or unusable. Satisfaction of
rigid criteria is likely to be crucial only when abnormalities are
marginal. In most eases, results are likely to be interpretable if
values are within normal limits or are substantially deranged.
However, minor abnormalities should be interpreted with great
care.
Our spirometry workshops were successful, as judged by the
traditional performance criteria of written, and practical
assessments. These assessments were incorporated into our
workshops both to reinforce educational goals and to provide
objective measures of the efficacy of our training. The pre-

workshop written assessment exposed low baseline knowledge.
Immediately postworkshop, the scores improved significantly,
albeit representing short-term recall only. The results of
retesting
at week 19, demonstrated the need for continuing education to
maintain standards. Our training program was very consistent in
the magnitude of its training effect. Following week 12, when
the usual groups had received the "initial" spirometry workshop
given to the trained group at week 0, both groups were
achieving a very similar proportion of acceptable blows. The
practical
assessments also demonstrated a significant training effect.
However, although statistically significant, the improvement
associated with training may not have been of clinical value. An
acceptable practical assessment did not necessarily translate
into acceptable spirometry when performed in clinical practice.
Prior to the development of "smart" spirometers, we did not
have the ability to objectively assess spirometry performed in
clinical practice. It is perhaps not unexpected that the results
were appreciably different from those obtained using more
traditional assessments. It is well recognized that knowledge
may
not reliably predict behavior.[16]
Although we primarily addressed the quality of spirometry
performance, the interpretations of spirometry were incorrect in
almost 50% of the eases reviewed. Accurate interpretation is
highly dependent not only on a well-performed procedure, but
also on an appreciation of physiology, appropriate choice of
normal values, and clinical knowledge of the patient. Our
results
indicate an important gap in knowledge and understanding that
dearly requires more training and experience than our

workshops could provide.
It is certainly possible that longer, more intensive workshops
may have produced better results. However, if spirometry is to
be
widely available in primary care practice, the sheer logistics of
training and maintaining standards among large numbers of
practitioners dictates a condensed and pragmatic training
program. Our workshops did pay particular attention to the
acceptability and reproducibility criteria and to the importance
of quality assurance. Following the workshops, all practitioners
at
week 12 also received individual feedback on their written and
practical assessments. Most importantly, we thought, they were
given individual feedback on the quality of the spirometry they
had performed in the previous 19, weeks, which had been
provisionally analyzed for acceptability and reproducibility.
The Lung Health Study demonstrated that, even in a dedicated
research setting with meticulous attention to quality, technician
performance fell over time.[9] However, the quality of
spirometry not only improved dramatically, but was maintained
with regular monitoring of test session quality and prompt
individual feedback. We selected the Vitalograph 2120
spirometer with its inbuilt quality assurance prompts with the
expectation that it would improve the quality of spirometry.
This, unfortunately, did not appear to be the ease, although, to
our
knowledge, a comparison of performance quality between
spirometers with and without quality prompts has not been
done.
Even when a spirometer is available on-site, underuse remains a
problem, as demonstrated by a Canadian study.[17] Despite
almost 60% of doctors having direct access to spirometry
equipment, primary care doctors had a low index of suspicion
for

COPD and markedly underused spirometry. In our study, the
mean number of patient tests per week was only 2.3. The
reasons for this are likely to be multifactorial and may change
with further education and the use of incentives.
Acknowledging
the recent National Lung Health Education Program
publication,[1] it is disappointing that, despite education,
spirometer was
dearly …
Vol. 333 No. 11 INCENTIVE SPIROMETRY FOR
PULMONARY COMPLICATIONS IN SICKLE CELL
DISEASES 699
INCENTIVE SPIROMETRY TO PREVENT ACUTE
PULMONARY COMPLICATIONS IN SICKLE
CELL DISEASES
P
AUL
S. B
ELLET
, M.D., K

AREN
A. K
ALINYAK
, M.D., R
AKESH
S
HUKLA
, P
H
.D., M
ICHAEL
J. G

ELFAND
, M.D.,
AND
D
ONALD
L. R
UCKNAGEL
, M.D., P
H
.D.
Abstract
Background.

This study was designed to de-
termine the incidence of thoracic bone infarction in pa-
tients with sickle cell diseases who were hospitalized with
acute chest or back pain above the diaphragm and to test
the hypothesis that incentive spirometry can decrease the
incidence of atelectasis and pulmonary infiltrates.
Methods.
We conducted a prospective, randomized
trial in 29 patients between 8 and 21 years of age with
sickle cell diseases who had 38 episodes of acute chest
or back pain above the diaphragm and were hospitalized.
Each episode of pain was considered to be an independ-
ent event. At each hospitalization, patients with normal or
unchanged chest radiographs on admission were ran-
domly assigned to treatment with spirometry or to a con-
trol nonspirometry group. Each patient in the spirometry
group took 10 maximal inspirations using an incentive spi-
rometer every two hours between 8 a.m. and 10 p.m. and
while awake during the night until the chest pain subsid-
ed. A second radiograph was obtained three or more days
after admission, or sooner if clinically necessary, to de-
termine the incidence of pulmonary complications. Bone
scanning was performed no sooner than two days after
hospital admission to determine the incidence of thoracic
bone infarction.
Results.

The incidence of thoracic bone infarction
was 39.5 percent (15 of 38 hospitalizations). Pulmonary
complications (atelectasis or infiltrates) developed during
only 1 of 19 hospitalizations of patients assigned to the
spirometry group, as compared with 8 of 19 hospitaliza-
tions of patients in the nonspirometry group (P
�
0.019).
Among patients with thoracic bone infarction, no pulmo-
nary complications developed in those assigned to the
spirometry group during a total of seven hospitalizations,
whereas they developed during five of eight hospitaliza-
tions in the nonspirometry group (P
�
0.025).
Conclusions.
Thoracic bone infarction is common in
patients with sickle cell diseases who are hospitalized with
acute chest pain. Incentive spirometry can prevent the pul-
monary complications (atelectasis and infiltrates) associat-
ed with the acute chest syndrome in patients with sickle
cell diseases who are hospitalized with chest or back pain
above the diaphragm. (N Engl J Med 1995;333:699-703.)

From the Division of General Pediatrics (P.S.B.), the Division
of Hematology–
Oncology and the Cincinnati Comprehensive Sickle Cell Center
(K.A.K.,
D.L.R.), and the Department of Radiology (M.J.G.), Children’s
Hospital Medical
Center; and the Department of Environmental Health, Division
of Biostatistics
and Epidemiology, University of Cincinnati College of
Medicine (R.S.) — all
in Cincinnati. Address reprint requests to Dr. Bellet at the
Department of Pedi-
atrics, Children’s Hospital Medical Center, 3333 Burnet Ave.,
Cincinnati, OH
45229-3039.
Supported in part by a grant (5 P60 HL 15996) from the
National Heart, Lung,
and Blood Institute.
P
ATIENTS with sickle cell diseases are prone to an
acute chest syndrome of chest pain and the pres-
ence of pulmonary infiltrates on chest radiography.
1
The cause of most cases of the acute chest syndrome is

uncertain.
2
Pneumonia is often among the causes con-
sidered, but bacterial, viral, or mycoplasma infection is
infrequently documented.
3-8
Vaso-occlusion due to in-
travascular sickling of red cells in the lung and embo-
lism of thrombus or bone marrow
9
have never been
shown conclusively to be present in the majority of cas-
es. Although the illness is frequently self-limited when
the infiltrate is confined to a small area, it can progress
rapidly and may be fatal.
10
In some patients with the acute chest syndrome, ra-
dionuclide imaging showed focal changes in the bony

thorax (ribs, sternum, and thoracic vertebrae) indica-
tive of bone infarction.
11
In a retrospective analysis of
bone scans, we found a high degree of correlation be-
tween thoracic bone infarction and the presence of a
pulmonary infiltrate.
12
We propose that in many cases
the primary event leading to the acute chest syndrome
is thoracic bone infarction, which predisposes patients
to the development of the acute pulmonary complica-
tions of atelectasis or infiltrates. Analgesia to relieve
the pain of splinting and the use of the incentive spi-
rometer to ensure lung aeration may prevent these
complications. The incentive spirometer measures the
inspiratory capacity of the lungs and is designed to en-
courage deeper inspiratory effort. To test our hypothe-
sis, we conducted a prospective, randomized trial of in-
centive spirometry in patients with sickle cell diseases
who were hospitalized with acute chest or back pain
above the diaphragm. The incidence of thoracic bone
infarction was also determined.
M

ETHODS
Patients and Study Design
All patients received health care at the Comprehensive Sickle
Cell Center at Children’s Hospital Medical Center, Cincinnati.
Twenty-nine patients (14 female and 15 male) between 8 and 21
years of age with sickle cell diseases who had acute chest or
back
pain above the diaphragm and who were admitted to the hospital
were enrolled in the study between October 1, 1990, and August
1,
1994. Twenty-three patients had homozygous sickle cell
anemia,
three had sickle cell–hemoglobin C disease, two had sickle cell–
b
�
-
thalassemia, and one had sickle cell–hemoglobin D disease (Los
An-
geles). Reasons for hospital admission included acute chest or
back
pain, usually unrelieved by two doses of intravenous morphine;
fe-
ver; respiratory distress; a sharp decrease in the hemoglobin
concen-
tration; and a need for oxygen. These 29 patients were

hospitalized
a total of 38 times — 21 had 1 hospitalization, 7 had 2 , and 1
had
3. Patients less than 7 years old were not enrolled because of
their
difficulty in learning how to use the incentive spirometer
effectively.
Patients were admitted to the Hematology– Oncology Service of
the
Children’s Hospital Medical Center, and the attending
hematologist
was responsible for their treatment. The investigators
supervised
the use of the incentive spirometer and the collection of data.
The
study protocol was approved by the institutional review board,
and
The New England Journal of Medicine
Downloaded from nejm.org at SLIPPERY ROCK UNIV on May
3, 2020. For personal use only. No other uses without
permission.
Copyright © 1995 Massachusetts Medical Society. All rights
reserved.
700 THE NEW ENGL AND JOURNAL OF MEDICINE Sept.
14, 1995
informed consent was obtained from all study patients or their
par-
ents or legal guardians.

A complete history was taken and a physical examination per-
formed when the patients presented to the Comprehensive
Sickle Cell
Center or the emergency department. Some had pain elsewhere
than
in the chest or back, and others did not. The following tests
were per-
formed: complete blood and differential counts, reticulocyte
count,
measurement of hemoglobin F and S concentrations, blood
culture if
the patient was febrile, measurement of oxygen saturation with
a
pulse oximeter while the patient breathed room air, and chest
radi-
ography. At each hospitalization, patients with normal chest
radio-
graphs or radiographs that were unchanged since the previous
exam-
ination were randomly assigned to one of two groups — a
spirometry
group that received standard care as well as the use of the
incentive
spirometer, and a nonspirometry group that received standard
care
only. Bone scanning was performed during each hospitalization
to de-
termine the incidence of thoracic bone infarction.
Incentive spirometry was used by patients in the spirometry
group every two hours between 8 a.m. and 10 p.m. and while
they
were awake at night until the chest pain had subsided. The
stand-
ard Volurex volumetric incentive spirometer (Diemolding

Health-
care Division, Canastota, N.Y.) was used to measure inspiratory
ca-
pacity. The patients were asked to take 10 maximal inspirations,
and the inspiratory capacities on the 4th, 5th, and 6th
inspirations
were measured and recorded. A second chest radiograph was ob-
tained at least three days after admission to the hospital, or
sooner
if clinically necessary, to determine whether pulmonary
complica-
tions (atelectasis or infiltrates) had developed. We grouped
atelec-
tasis and infiltrates together because it is often difficult to
distin-
guish between them radiographically. The chest radiographs
were
interpreted in a blinded manner; that is, the radiologists did not
know whether a patient was assigned to the spirometry group or
the
nonspirometry group during a given hospitalization. Bone
scanning
was performed two hours after the intravenous administration of
technetium Tc 99m medronate (0.185 mCi per kilogram of body
weight; maximum, 12 mCi) no sooner than two days after
admis-
sion to the hospital to determine the incidence of thoracic bone
in-
farction. General-purpose or high-resolution parallel-hole
collima-
tion was used. Intravenous fluid — 5 percent dextrose in 0.45
percent sodium chloride — was routinely given at 1 to 1.5 times
the
maintenance rate for at least the initial 24 hours of
hospitalization.
Antibiotics were given for oral temperatures greater than 38

°
C and
for suspected or known bacterial infection. Blood transfusions
were
given when the hemoglobin concentration fell below 6 g per
deci-
liter.
Different analgesic agents were used to treat pain. The narcotics
included morphine, hydromorphone, meperidine, methadone,
fenta-
nyl, oxycodone, and codeine (with acetaminophen). The dosage
was
adjusted according to the patient’s comfort or tolerance. Non-
narcot-
ic analgesic agents included ketorolac, naproxen, ibuprofen, and
acetaminophen. The amount of narcotics given during each
hospital-
ization was recorded in milligrams of morphine equivalents per
kilo-
gram of body weight
13
and for ketorolac in milligrams of morphine
equivalents per kilogram, assuming that 30 mg of intravenous
keto-
rolac equals 12 mg of intravenous morphine.

14
The total narcotic dos-
age was compared in the spirometry and nonspirometry groups
(Ta-
ble 1) as a measure of the amount of pain that was experienced.
The data were entered in our computer data base and analyzed
with SAS software. Patients were randomly assigned to the
spirome-
try or nonspirometry group at each hospitalization, according to
the
forced-randomization procedure of Taves.
15
All dichotomized data
were analyzed with Fisher’s exact test for two-by-two tables,
and all
continuous data were analyzed with Student’s t-test. All the
tests
were two-tailed. Logistic-regression analysis was used to assess
the
effect of incentive spirometry on decreasing the incidence of
pulmo-
nary complications (atelectasis or infiltrates), independent of
other
confounding variables.

Since 8 of 29 patients had more than one hospitalization, we in-
vestigated whether each hospitalization could be treated as an
inde-
pendent event. To do this, we used the method described by
Liang
and Zeger.
16
Two sets of analyses were compared to assess the effec-
tiveness of incentive spirometry in preventing pulmonary
complica-
tions. In one analysis, the existence of within-patient
correlation was
assumed, whereas in the other, a within-patient correlation of
zero
was assumed. The P values for the regression coefficients for
treat-
ment effect were almost identical (0.0252 and 0.0261 for the
inde-
pendence model and the exchangeable-correlation model,
respec-
tively). These two Liang–Zeger analyses gave P values close to
that
obtained with Fisher’s exact test (P
�
0.019). Therefore, Fisher’s ex-
act test, which assumes the independence of events, could be
used

in the analysis to determine whether incentive spirometry
prevents
pulmonary complications.
As a measure of the effectiveness of the patient’s performance
in
*Plus –minus values are means
�
SD.
†N denotes number of hospitalizations. Patients were assigned
to one of the two groups at
each admission.
‡N
�
14. §N
�
16. ¶N
�

17.
Table 1. Selected Clinical Features of Patients Assigned to Spi-
rometry or Standard Care without Spirometry during
Hospitalization.
*
C
LINICAL
F
EATURE
S
PIROMETRY
(N
�

19)†
N
ONSPIROMETRY
(N
�
19)†
Sex — F/M 8/11 11/8
Age — yr 15.0
�
4.2 16.8
�
3.0
Genotype — no.
Homozygous sickle cell anemia 14 16
Sickle cell–hemoglobin C disease 3 1

Sickle cell–hemoglobin D disease
(Los Angeles)
0 2
Sickle cell–
b
�
-thalassemia 2 0
Temperature (oral) on admission —
°
C 37.6
�
1.2 37.3
�
0.8
Respiratory rate — per min 24.2

SERIES ‘‘ATSERS TASK FORCE STANDARDISATION OF LUNGFUNCTION.docx

SERIES ‘‘ATSERS TASK FORCE STANDARDISATION OF LUNGFUNCTION.docx

Recommended

Recommended

More Related Content

Similar to SERIES ‘‘ATSERS TASK FORCE STANDARDISATION OF LUNGFUNCTION.docx

Similar to SERIES ‘‘ATSERS TASK FORCE STANDARDISATION OF LUNGFUNCTION.docx (20)

More from klinda1

More from klinda1 (20)

Recently uploaded

Recently uploaded (20)

SERIES ‘‘ATSERS TASK FORCE STANDARDISATION OF LUNGFUNCTION.docx