Outcome of Lumbar Fusion
Clinical Study
The long-term outcome of lumbar fusion in the Swedish
lumbar spine study
Rune Hedlund, MD, PhD a,*, Christer Johansson, MSc a, Olle Hägg, MD, PhD b, Peter Fritzell, MD, PhD c, Tycho Tullberg, MD, PhD d, Swedish Lumbar Spine Study Group
a Department of Orthopaedics, Sahlgrenska University Hospital, Bruna stråket 11, Gothenburg, SE 413 45, Sweden
b Göteborg Spine Center, Gruvgatan 8, Västra Frölunda, SE 421 30, Sweden
c Department of Orthopedics, Länssjukhuset, Ryhov, SE 551 85 Ryhov, Sweden
d Stockholm Spine Center AB, Löwenströmska Sjukhuset, Upplands Väsby, SE 194 89, Sweden
Received 19 February 2015; revised 30 July 2015; accepted 27 August 2015
Abstract
BACKGROUND CONTEXT: Current literature suggests that in the long term, fusion of the lumbar spine in chronic low back pain (CLBP) does not result in an outcome clearly better than structured conservative treatment modes.
PURPOSE: This study aimed to assess the long-term outcome of lumbar fusion in CLBP, and also to assess methodological problems in long-term randomized controlled trials (RCTs).
STUDY DESIGN: A prospective randomized study was carried out.
PATIENT SAMPLE: A total of 294 patients (144 women and 150 men) with CLBP of at least 2 years' duration were randomized to lumbar fusion or non-specific physiotherapy. The mean follow-up time was 12.8 years (range 9–22). The follow-up rate was 85%; exclusion of deceased patients resulted in a follow-up rate of 92%.
OUTCOME MEASURES: Global Assessment (GA) of back pain, Oswestry Disability Index (ODI), visual analogue scale (VAS) for back and leg pain, and Zung depression scale were determined. Work status, pain medication, and pain frequency were also documented.
METHODS: Standardized outcome questionnaires were obtained before treatment and at long-term follow-up. To optimize control for group changers, four models of data analysis were used: (1) intention to treat (ITT), (2) "as treated" (AT), (3) per protocol (PP), and (4) a model in which group changers in the conservative group were automatically classified as unchanged or worse in GA (GCAC). The initial study was sponsored by Acromed (US$50,000–US$100,000).
RESULTS: Except for the ITT model, the GA, the primary outcome measure, was significantly better for fusion. The proportion of patients much better or better in the fusion group was 66%, 65%, and 65% in the AT, PP, and GCAC models, respectively. In the conservative group, the same proportions were 31%, 37%, and 22%, respectively. However, the ODI, VAS back pain, work status, pain medication, and pain frequency were similar between the two groups.
CONCLUSIONS: One can conclude that from the patient's perspective, reflected by the GA, lumbar fusion surgery is a valid treatment option in CLBP. On the other hand, secondary outcome measures such as ODI and work status, best analyzed by the PP model, indicated that substantial disability remained at long-term after fusion as well as after conservative treatment. The lack of objective outcome measures in CLBP and the cross-over problem transform an RCT into an observational study,
FDA device/drug status: Not applicable.
Author disclosures: RH: Grants: Acromed Corporation (D), Zimmer (B),
outside the submitted work; Consulting: K2M (A), Globus Medical (B),
Medtronic (B), Zimmer (B), outside the submitted work. CJ: Nothing to
disclose. OH: Other: Spine Center Göteborg (A), outside the submitted
work. PF: Nothing to disclose. TT: Dr Tullberg reports being CEO of
Stockholm Spine Center (SSC), a private spine clinic, and Stockholder in
SSC and in Global Health Partner, which is the main owner of SSC.
* Corresponding author. Department of Orthopaedics, Sahlgrenska University Hospital, Bruna stråket 11, Gothenburg, SE 413 45, Sweden. Tel.: +46313434060.
E-mail address: rune.hedlund@vgregion.se (R. Hedlund)
http://dx.doi.org/10.1016/j.spinee.2015.08.065
1529-9430/© 2015 Elsevier Inc. All rights reserved.
The Spine Journal 16 (2016) 579–587
Downloaded from ClinicalKey.com at Naval Medical Center - San Diego June 11, 2016.
For personal use only. No other uses without permission. Copyright ©2016. Elsevier Inc. All rights reserved.
that is, Level 2 evidence. The discrepancy between the primary and secondary outcome measures prevents a strong conclusion on whether to recommend fusion in non-specific low back pain. © 2015 Elsevier Inc. All rights reserved.
Keywords: Chronic low back pain; Conservative treatment; Long-term outcome; Lumbar fusion; Physical therapy;
Randomized trial
Introduction
Despite an abundance of clinical studies, the outcome of fusion for chronic low back pain (CLBP) remains a highly controversial subject in spine surgery. In the first randomized controlled trial (RCT) comparing lumbar fusion with conservative therapy, the Swedish Lumbar Spine Study Group reported a positive short-term effect of surgery compared with unstructured physiotherapy [1]. In contrast, in British and Norwegian short-term as well as long-term studies, no statistically or clinically relevant difference could be demonstrated comparing fusion with physiotherapy and cognitive therapy [2–5].
From an evidence-based medicine perspective, well-designed and executed RCTs are considered Level 1 studies on which recommendations on treatment can be based. However, an important limitation of long-term RCTs that compare surgical with conservative treatment is crossover between treatments, which undermines the randomization process. A further problem in long-term studies is the follow-up rate; there is a risk of selection bias when a suboptimal number of patients is available for follow-up. In a recent combined British-Norwegian study on CLBP [5], the follow-up rate was 55%, which is generally considered too low for robust conclusions.
The present study was performed to determine the long-term outcome of CLBP treated with fusion or an unstructured physiotherapy program. The data presented are the long-term RCT follow-up of the Swedish Lumbar Spine Study [1]. A further objective was to analyze the methodological problem associated with crossover patients in long-term RCTs.
Materials and methods
Consecutively referred patients with CLBP aged 25–65 years [1] were eligible to participate in the study (Table 1). The inclusion criteria were age 25–65 years, either sex, severe CLBP of at least 2 years' duration, and no signs of nerve root compression. Further inclusion criteria were sick leave or "equivalent" major disability for at least 1 year and unsuccessful medical interventional treatment efforts. Radiological inclusion criteria were degenerative changes at L4–L5 or L5–S1 ("spondylosis") on plain radiographs, computed tomography, or magnetic resonance imaging.
Exclusion criteria were previous spine surgery except for successful removal of a herniated disc, spondylolysis, spondylolisthesis, new or old fractures, infection, inflammatory process, or neoplasm. No patient was diagnosed with a spondylolysis intraoperatively.
There were 294 patients, 144 women and 150 men, with a mean age of 47 years (range, 28–72). A 3:1 randomization between different types of fusion and physiotherapy was performed by a computer-generated random sequence, which resulted in 222 patients in the fusion group and 72 in the conservative group. The different types of fusion were (1) non-instrumented posterolateral fusion, (2) instrumented posterolateral fusion with internal fixation, and (3) instrumented circumferential fusion with additional interbody bone graft, either as anterior lumbar interbody fusion or posterior lumbar interbody fusion. Only autografts were used. The different types of fusion resulted in a similar outcome at 2 years [6], as well as at long-term follow-up, as observed in the present study. Therefore, only the combined results of the fused patients are presented. Crossover, that is, change of treatment group post randomization, can be followed in the flowchart shown in Fig. 1.
The randomization resulted in 40.6% smokers in the surgical group and 49.3% in the conservative group, introducing the risk of bias. The difference was accounted for by multiple linear regression analysis, with adjustment for age, gender, smoking, pretreatment pain duration, previous spine surgery, and Oswestry Disability Index (ODI) at baseline, all well-known risk factors for a worse outcome in spine surgery.
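A chance imbalance of this kind is an expected by-product of randomization. As a minimal sketch, assuming simple (unrestricted) randomization, which the text does not specify, a 3:1 computer-generated allocation could look like the following. The function name `allocate`, the seed, and the group labels are hypothetical and only serve the illustration.

```python
import random


def allocate(n_patients, ratio=3, seed=None):
    """Assign each patient to 'fusion' or 'conservative' at ratio:1.

    Assumption: simple randomization, so group sizes only approximate
    the target ratio (compare 222 vs. 72 of 294 in the study).
    """
    rng = random.Random(seed)
    p_fusion = ratio / (ratio + 1)
    return [
        "fusion" if rng.random() < p_fusion else "conservative"
        for _ in range(n_patients)
    ]


groups = allocate(294, seed=2015)
print(groups.count("fusion"), groups.count("conservative"))
```

Because each allocation is independent, the realized split and the distribution of baseline factors such as smoking vary from run to run, which is why the analysis adjusts for baseline covariates.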
The primary outcome measure was Global Assessment (GA) in which the patient classified the outcome as "much better," "better," "unchanged," or "worse." Secondary outcome
Table 1
Baseline demographic and clinical characteristics
Characteristic | Surgical group (n=222) | Medical interventional group (n=72)
Age (range) | 43 (25–64) | 44 (26–63)
Sex (female) | 112 (50.5%) | 37 (51.4%)
Smoking | 40.6% | 49.3%
Comorbidity | 39.1% | 23.5%
Mean pain duration, years (range) | 7.8 (2–34) | 8.5 (2–40)
Mean time of sick leave, years (range) | 3.2 (0.1–18) | 2.9 (0.1–8)
Working part or full time | 20.9% | 23.6%
ODI (0–100) | 47.3 (11.4) | 48.4 (11.9)
VAS back pain (0–100) | 64.2 (14.3) | 62.6 (14.3)
VAS leg pain (0–100) | 35.3 (25.4) | 35.6 (25.2)
Zung depression scale (20–80) | 39.1 (13.3) | 39.4 (13.9)
ODI, Oswestry Disability Index; VAS, Visual Analogue Scale.
Data are means (standard deviation [SD]) or numbers (%) unless stated otherwise.
measures were ODI, visual analogue scale (VAS) for back pain, VAS for leg pain, patient satisfaction, Zung depression scale, work status, pain frequency, and pain medication.
In the conservative group, 26 of 72 (36%) patients were eventually operated on: 7 patients were operated on before treatment was initiated and 19 after treatment (Fig. 1). To compare the effect of different ways of allocating group changers on the observed outcome, we analyzed the data according to four different models. The definition and rationale of the different models can be summarized as follows.
Model 1: The intention to treat model (ITT) is the only way to retain the advantages of randomization. ITT reflects the outcome after choosing a particular treatment at one point in time, regardless of what happens later on.
Model 2: In the "as treated" (AT) model, patients are analyzed as surgically treated if they actually were operated on, either as a consequence of randomization or as a consequence of changing group from conservative to surgery. Group changers from conservative to surgical treatment may bias both groups.
Model 3: In the "per protocol" (PP) model, only patients following their randomization through the complete study are included. Group changers are excluded, which may introduce selection bias in the primary group. The difference from AT is that conservatively treated patients crossing over to surgery are not included in the surgical group, keeping the surgical group unbiased.
Model 4: In this model, which has not been used in previous publications, in crossover patients (n=19) from conservative treatment to fusion, the GA was automatically classified
Context
The authors present results of a prospective randomized trial regarding the use of lumbar fusion or non-specific physical therapy for the treatment of chronic low back pain.
Contribution
This was a prospective randomized study that included 294 patients and followed them for close to 13 years. A number of different analytic approaches including intent to treat and per protocol were used to compare outcomes between the two treatment groups. Success rates in the lumbar fusion group were in the range of 65%, although comparable improvement was markedly lower among those patients treated using physical therapy. The authors cautiously interpret their data.
Implications
This study presents Level II evidence as the authors appropriately recognize. The reader should also note that the study was funded by Acromed, which may introduce a further potential for bias. While patients had superior results following fusion as compared to physical therapy, a success rate after fusion of only 65% is clearly cause for concern, particularly in light of the risks of surgery and associated health care costs. Findings for a Swedish population may also not be completely translatable to the American demographic. These factors should be considered when applying the results of this study to clinical practice.
- The Editors
Fig. 1. CONSORT 2010 Flow Diagram of the patients during the study period.
(GCAC) as "unchanged" or "worse," and remained in the conservative group. This classification is based on the assumption that an individual opting for surgery after having tried conservative treatment has not experienced any improvement. With this model, no other outcome variable can be analyzed because a numerical value would be totally arbitrary.
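The four models differ only in how a patient's analysis group and GA result are derived from randomization and actual treatment. The following is a minimal sketch of that logic, not the study's actual procedure: the `Patient` fields and function names are hypothetical, and further distinctions (for example, patients operated on before conservative treatment began) are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    randomized_to: str  # "fusion" or "conservative"
    had_fusion: bool    # True if the patient was eventually fused
    ga: str             # "much better", "better", "unchanged", or "worse"


def crossed_over(p: Patient) -> bool:
    """Conservative-arm patient who was later fused."""
    return p.randomized_to == "conservative" and p.had_fusion


def analysis_group(p: Patient, model: str) -> Optional[str]:
    """Group in which the patient is analyzed, or None if excluded."""
    if model in ("ITT", "GCAC"):
        return p.randomized_to  # analyzed as randomized
    if model == "AT":
        return "fusion" if p.had_fusion else "conservative"
    if model == "PP":
        followed = (p.randomized_to == "fusion") == p.had_fusion
        return p.randomized_to if followed else None
    raise ValueError(f"unknown model: {model}")


def ga_improved(p: Patient, model: str) -> bool:
    """Collapse GA to improved/failed; GCAC forces crossovers to failed."""
    if model == "GCAC" and crossed_over(p):
        return False
    return p.ga in ("much better", "better")


changer = Patient(randomized_to="conservative", had_fusion=True, ga="better")
for model in ("ITT", "AT", "PP", "GCAC"):
    print(model, analysis_group(changer, model), ga_improved(changer, model))
```

For a group changer who reports "better" after eventual fusion, the same patient counts as a conservative success under ITT, a surgical success under AT, is dropped under PP, and counts as a conservative failure under GCAC.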
This RCT has been registered at the Australian New
Zealand Clinical Trials Registry with trial registration number
ACTRN126150001625166.
Statistical analysis
Estimation of sample size has been reported previously [1]. Differences between groups were analyzed by multiple linear regression analysis, with adjustment for age, gender, smoking, pretreatment pain duration, previous spine surgery, and ODI at baseline. Adjusted mean differences between treatment groups and 95% confidence intervals are reported. For GCAC, differences between the two groups could only be analyzed for the GA. This was performed by collapsing "much better" and "better" into an "improved" group and the "unchanged" and "worse" into a "failed" group, with differences between groups analyzed by multivariable logistic regression analysis adjusted for the same covariates as above. All p-values were two-sided. Those p-values less than or equal to the alpha level, set at .05, were considered statistically significant. Statistical analyses were performed using IBM SPSS version 22 (Chicago, IL, USA).
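To make the dichotomized GA comparison concrete, the sketch below computes a crude (unadjusted) odds ratio from two proportions of "improved" patients. It is only an illustration: the study reports odds ratios from a multivariable logistic regression adjusted for baseline covariates, which this calculation does not reproduce. The function names are hypothetical, and the example proportions (65% and 25%) are those reported for the GCAC model in Table 5.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) to odds."""
    if not 0 < p < 1:
        raise ValueError("proportion must be strictly between 0 and 1")
    return p / (1 - p)


def crude_odds_ratio(p_fusion, p_conservative):
    """Unadjusted odds ratio of 'improved' for fusion vs. conservative."""
    return odds(p_fusion) / odds(p_conservative)


# Proportions "much better/better" under the GCAC model (Table 5).
print(round(crude_odds_ratio(0.65, 0.25), 2))
```

The crude value (about 5.6) is of the same order as, but not identical to, the adjusted odds ratio of 4.98 in Table 5; the gap reflects the covariate adjustment and rounding of the percentages.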
Results
The mean follow-up time was 12.8 years (range, 9–22). The follow-up rate was 251 of 294 patients (85%). Twenty patients died, all of causes unrelated to CLBP; excluding them resulted in a follow-up rate of 92%. The mean age at follow-up was 59 years (range, 39–79). At long-term follow-up, 21% of the patients were above the standard retirement age in Sweden (65 years).
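The two follow-up rates can be checked with a short calculation. This sketch assumes that all 20 deceased patients are among those not followed up and are simply removed from the denominator; the function name is hypothetical.

```python
def follow_up_rate(followed_up, randomized, deceased=0):
    """Share of patients followed up, optionally excluding the deceased
    from the denominator."""
    return followed_up / (randomized - deceased)


crude = follow_up_rate(251, 294)                   # 251 / 294
excl_deceased = follow_up_rate(251, 294, deceased=20)  # 251 / 274
print(f"{crude:.1%} {excl_deceased:.1%}")
```

The results, about 85.4% and 91.6%, round to the 85% and 92% reported in the text.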
The primary outcome measure, GA, was similar in the surgical and the conservative group using the ITT model (Table 2). In contrast, using the other three models, the AT, the PP, and the GCAC, the GA was significantly better in the fused group (Tables 3–5).
For the other outcome measures, however, the results were non-significant. The only exception was Zung depression scale in the AT model, which showed a slight, but significant, unfavorable result for the conservatively treated patients. Work ability, as well as pain medication and pain frequency, were not different between the groups, irrespective of model of analysis.
In the longitudinal analysis according to ITT, both groups improved statistically significantly in ODI: the conservative group 11.7 points, and the surgical group 10.5 points. The findings were similar for the other outcome measures and irrespective of model of analysis (ITT, AT, PP). Compared with baseline data, patients improved non-significantly more in the surgical compared with the conservative group; the differences were 0.4–4.1 points in ODI and 0.3–5.5 points in VAS back pain, the range depending on the analysis model (Fig. 2).
Complications. There were no deaths in connection with surgery. There were five patients (2%) with a general postoperative complication: one patient with thrombosis and pulmonary embolism, one with thrombosis, one with aspiration with sepsis, one with pulmonary edema, and one with heart failure with gastrointestinal bleeding. Of the patients operated on with pedicle screws, 9 of 140 (6%) had postoperative nerve root pain, necessitating reoperation in 3 patients. Three deep (1.4%) and two superficial infections were successfully handled with debridement. One patient had severe and persistent donor site pain. Two patients suffered a late implant-related infection, with subsequent removal of the implant. During the first 2 years, 17 of the patients (8%) underwent reoperations. Implant removal to control healing of fusion, if necessary, was part of the research protocol. The exact indication for reoperation could not be established in each individual patient. There were no obvious complications in the medical interventional group [1].
Discussion
This study is the largest single RCT comparing fusion with conservative treatment in CLBP. It also has the longest follow-up and the highest follow-up rate, allowing an improved insight into the long-term course. Irrespective of treatment, the patients were on average improved at long-term, but only by about 10 ODI points. In all models of analysis except the ITT, the GA was statistically significantly better for fusion.
Except for the GA, the difference in outcome between the two treatment modes was statistically as well as clinically non-significant. Substantial disability was observed for both groups, showing that the effect of fusion as well as that of conservative treatment cannot be considered a cure. In the total material the level of disability was relatively high with an ODI between 35.8 and 40.1. Such a level of ODI is considered moderate (21%–40%) on the border to severe (41%–60%) disability. It should also be noted that at long-term, only 40% were working, despite the majority (79%) being under normal retirement age.
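The ODI bands referred to above can be expressed as a simple lookup. In this sketch only the "moderate" and "severe" bands come from the text; the remaining bands and their labels follow the conventional interpretation of the ODI and are an addition here, as is the treatment of non-integer scores that fall between bands (each band is taken to extend up to and including its upper limit).

```python
def odi_category(score):
    """Map an ODI score (0-100) to a conventional disability band."""
    if not 0 <= score <= 100:
        raise ValueError("ODI score must be between 0 and 100")
    if score <= 20:
        return "minimal"      # assumed conventional band, not from the text
    if score <= 40:
        return "moderate"     # 21%-40% per the text
    if score <= 60:
        return "severe"       # 41%-60% per the text
    if score <= 80:
        return "crippled"     # assumed conventional band
    return "bed-bound or exaggerating"  # assumed conventional band


for mean_odi in (35.8, 40.1):
    print(mean_odi, odi_category(mean_odi))
```

Under these cut-offs the lowest reported group mean (35.8) is classed as moderate, while the highest (40.1) falls just past the boundary into severe, which matches the description "on the border to severe."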
The rather high Zung depression scale underlines the impression that neither fusion nor unstructured physiotherapy or natural history normalizes quality of life in patients with CLBP in the long-term. Compared with the 2-year follow-up [1], the level of disability at long-term was largely unchanged, suggesting that irrespective of treatment there is neither improvement nor worsening over time (Fig. 2). The reason for choosing "unstructured" physiotherapy as control treatment was the absence of a generally accepted physiotherapeutic technique in CLBP; thus, the treatment was individualized, with local variations. In the early 1990s, at the time the present study was designed, cognitive therapy, as used in the British and Norwegian studies which started later, was not recognized as a treatment option for low back pain.
Interestingly, the percentage of "worse" patients is similar between fused and conservatively treated patients, and this is not supportive of the notion that fusion in CLBP results in a substantial proportion of patients with failed back syndrome. The level of baseline comorbidity was higher in the surgical group (Table 1), possibly resulting in a relative underestimation of the effect of fusion compared with conservative therapy.
Importance of analytical model
The ITT model of analysis is often criticized in the context of comparing surgical and medical interventional treatment, keeping crossover patients in the conservative group even if they are eventually operated on. The ITT makes sense only from the perspective of initial strategy by the clinician. For example, choosing to recommend conservative treatment knowing that surgery with time is an option is reflected in the ITT analysis. In short, it reflects the effect of a strategy rather than the effect of a certain treatment. The results of such a strategy indicate that there is no difference in outcome between direct surgery or keeping an option open for surgery.
The AT model and the PP model showed a similar outcome for all variables studied. Approximately 2/3 of surgical patients and 1/3 of conservative patients classified themselves as "much better" or "better." The AT model results in a contamination of the surgical group with group changers from the conservative group, introducing the risk of bias in both groups. Despite its obvious shortcomings, the AT model has gained some support in clinical research lately, for example, in the Spine Patient Outcomes Research Trial studies [7].
The PP model reflects the most uncontaminated effect of fusion, as only patients randomized to fusion are studied, and the model gives the best estimate of the effect size [8]. According to the PP model, the effect of surgery was 11 ODI
Table 2
Intention to treat analysis (fusion vs. medical interventional treatment) on outcome at LTFU
Outcome measure | Fusion surgery (n=187), unadjusted mean (SD) | Medical interventional treatment (n=64), unadjusted mean (SD) | Adjusted LTFU treatment effect: fusion compared with medical interventional (95% CI)* | Adjusted p-value
Primary outcome
Global Assessment (back pain)
Much better | 24% | 24% | -0.07 (-0.04 to 0.2) | .640
Better | 38% | 27% | |
Unchanged | 20% | 40% | |
Worse | 18% | 9% | |
Secondary outcome
ODI (0–100)
Baseline | 47.1 (11.6) | 48.5 (11.7) | |
LTFU | 36.6 (18.3) | 36.8 (18.6) | -0.4 (-5.7 to 4.9) | .880
VAS LBP (0–100)
Baseline | 64.1 (14.3) | 63.0 (14.4) | |
LTFU | 46.7 (23.5) | 46.3 (24.6) | -0.3 (-7.3 to 6.7) | .937
VAS LP (0–100)
Baseline | 34.4 (25.5) | 36.2 (25.8) | |
LTFU | 41.6 (44.8) | 36.7 (26.7) | 5.6 (-6.8 to 18.0) | .376
Pain medication
Several times a day | 26% | 24% | -0.16 (-0.49 to 0.15) | .310
Every day | 20% | 10% | |
Occasionally | 33% | 45% | |
Never | 21% | 21% | |
Pain frequency
Always | 43% | 42% | -0.08 (-0.43 to 0.26) | .637
Daily | 26% | 20% | |
Several times a week | 11% | 12% | |
Occasionally | 20% | 26% | |
Work status
Full-time/part-time | 38% | 39% | 0.15 (-0.23 to 0.55) | .427
Not working | 62% | 61% | |
Zung depression scale (20–80)
Baseline | 43.5 (8.2) | 43.0 (8.7) | |
LTFU | 41.6 (10.7) | 41.7 (10.8) | -0.4 (-3.9 to 3.2) | .844
Would you go through the same treatment again? | 75% Yes | 85% Yes | 2.07 (0.89 to 4.82) | .093†
CI, confidence interval; ITT, intention to treat; LTFU, long-term follow-up; ODI, Oswestry Disability Index; SD, standard deviation; VAS LBP, visual analogue scale for low back pain; VAS LP, visual analogue scale for leg pain.
* Adjusted for age, gender, smoking status, previous back surgery, duration of low back pain, and baseline ODI values.
† Adjusted odds ratio from a multivariable logistic regression analysis with adjustment for baseline variables.
Table 3
As-treated analysis (fusion vs. medical interventional treatment) on outcome at LTFU
Outcome measure | Fusion surgery (n=205), unadjusted mean (SD) | Medical interventional treatment (n=46), unadjusted mean (SD) | Adjusted LTFU treatment effect: fusion compared with medical interventional (95% CI)* | Adjusted p-value
Primary outcome
Global Assessment (back pain)
Much better | 28% | 9% | -0.51 (-0.85 to -0.17) | .004
Better | 38% | 22% | |
Unchanged | 18% | 53% | |
Worse | 16% | 16% | |
Secondary outcome
ODI (0–100)
Baseline | 47.4 (11.2) | 47.6 (13.0) | |
LTFU | 35.8 (18.6) | 40.1 (17.3) | -4.1 (-10.2 to 1.9) | .182
VAS LBP (0–100)
Baseline | 64.0 (14.2) | 62.9 (15.1) | |
LTFU | 45.6 (23.7) | 51.0 (23.5) | -5.5 (-13.5 to 2.5) | .175
VAS LP (0–100)
Baseline | 34.7 (25.4) | 35.5 (26.2) | |
LTFU | 40.6 (43.9) | 39.1 (25.3) | 1.3 (-12.7 to 15.3) | .859
Pain medication
Several times a day | 27% | 20% | -0.14 (-0.51 to 0.22) | .442
Every day | 19% | 13% | |
Occasionally | 33% | 49% | |
Never | 21% | 18% | |
Pain frequency
Always | 43% | 42% | 0.06 (-0.33 to 0.47) | .736
Daily | 23% | 31% | |
Several times a week | 11% | 11% | |
Occasionally | 23% | 16% | |
Work status
Full-time/part-time | 39% | 39% | -0.14 (-0.60 to 0.31) | .529
Not working | 61% | 61% | |
Zung depression scale (20–80)
Baseline | 43.3 (8.2) | 43.7 (9.1) | |
LTFU | 40.7 (10.5) | 46.8 (10.7) | -5.8 (-10.1 to -1.7) | .006
Would you go through the same treatment again? | 76% | 86% | 1.65 (0.59 to 4.60) | .341†
CI, confidence interval; ITT, intention to treat; LTFU, long-term follow-up; ODI, Oswestry Disability Index; SD, standard deviation; VAS LBP, visual analogue scale for low back pain; VAS LP, visual analogue scale for leg pain.
* Adjusted for age, gender, smoking status, previous back surgery, duration of low back pain, and baseline ODI values.
† Adjusted odds ratio from a multivariable logistic regression analysis with adjustment for baseline variables.
Fig. 2. As treated analysis of conservatively and fused patients. ODI score at baseline, at 2 years, and at mean 12.8 years follow-up. The difference at long-term follow-up was statistically non-significant. Error bars: 95% CI.
points and 18 VAS back pain points, barely reaching reported levels of minimal clinically important difference, which, for ODI, is usually reported to be from 10 [9] to 20 points [10].
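The 11-point and 18-point figures appear to correspond to the change from baseline to long-term follow-up in the fusion arm of the per-protocol analysis (Table 4); that reading is an assumption, as the text does not state how they were derived. Under that assumption, the following sketch reproduces them from the unadjusted means and compares the ODI change with the cited 10 to 20 point range. Variable and function names are hypothetical.

```python
# Unadjusted means for the fusion arm under the per-protocol model (Table 4).
PP_FUSION = {
    "ODI": {"baseline": 47.4, "ltfu": 36.0},
    "VAS back pain": {"baseline": 64.2, "ltfu": 46.4},
}

# Range of minimal clinically important difference for ODI cited in the text.
ODI_MCID_RANGE = (10.0, 20.0)


def improvement(measure):
    """Baseline minus long-term follow-up; positive means improvement."""
    values = PP_FUSION[measure]
    return values["baseline"] - values["ltfu"]


def within_range(value, low, high):
    """True if value lies inside the closed interval [low, high]."""
    return low <= value <= high


for name in PP_FUSION:
    print(f"{name}: {improvement(name):.1f} points")
print(within_range(improvement("ODI"), *ODI_MCID_RANGE))
```

The ODI change of about 11.4 points sits near the lower end of the cited range, which is consistent with the description "barely reaching."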
Not surprisingly, the largest difference between the groups in GA was obtained when all patients crossing over from conservative treatment to surgery (26% of conservative group) were automatically classified as "unchanged" or "worse" in the conservative group, an outcome that can be considered as "failed" (Model 4). Such an analysis resulted in a larger difference between the groups in the percentage of "failed treatment": 35% for surgery versus 75% for conservative treatment. The logic behind classifying these patients as failures is that the conservative treatment most likely has not resulted in an acceptable improvement. A reallocation of these patients into the fusion group (AT model), or deleting them
Table 4
Per protocol analysis (fusion vs. medical interventional treatment) on outcome at LTFU
Outcome measure | Fusion surgery (n=179), unadjusted mean (SD) | Medical interventional treatment (n=39), unadjusted mean (SD) | Adjusted LTFU treatment effect: fusion compared with medical interventional (95% CI)* | Adjusted p-value
Primary outcome
Global Assessment (back pain)
Much better | 26% | 11% | -0.37 (-0.72 to -0.01) | .044
Better | 39% | 26% | |
Unchanged | 18% | 53% | |
Worse | 17% | 10% | |
Secondary outcome
ODI (0–100)
Baseline | 47.4 (11.5) | 48.7 (13.1) | |
LTFU | 36.0 (18.6) | 38.8 (18.3) | -2.4 (-8.9 to 4.0) | .445
VAS LBP (0–100)
Baseline | 64.2 (14.5) | 63.0 (15.4) | |
LTFU | 46.4 (23.6) | 50.8 (24.3) | -4.5 (-12.9 to 4.0) | .298
VAS LP (0–100)
Baseline | 34.6 (25.6) | 37.7 (25.7) | |
LTFU | 41.0 (45.4) | 37.0 (24.5) | 4.2 (-11.0 to 19.4) | .586
Pain medication
Several times a day | 26% | 19% | -0.24 (-0.64 to 0.14) | .216
Every day | 20% | 11% | |
Occasionally | 32% | 51% | |
Never | 22% | 19% | |
Pain frequency
Always | 43% | 41% | -0.01 (-0.43 to 0.40) | .955
Daily | 24% | 27% | |
Several times a week | 12% | 13% | |
Occasionally | 21% | 19% | |
Work status
Full-time/part-time | 38% | 41% | 0.02 (-0.46 to 0.51) | .934
Not working | 62% | 59% | |
Zung depression scale (20–80)
Baseline | 43.4 (8.0) | 44.3 (9.1) | |
LTFU | 41.1 (10.6) | 45.1 (11.2) | -4.33 (-8.83 to 0.17) | .059
Would you go through the same treatment again? | 75% | 84% | 1.58 (0.65 to 5.13) | .254†
CI, confidence interval; ITT, intention to treat; LTFU, long-term follow-up; ODI, Oswestry Disability Index; SD, standard deviation; VAS LBP, visual analogue scale for low back pain; VAS LP, visual analogue scale for leg pain.
* Adjusted for age, gender, smoking status, previous back surgery, duration of low back pain, and baseline ODI values.
† Adjusted odds ratio from a multivariable logistic regression analysis with adjustment for baseline variables.
Table 5
Results according to GCAC analysis for Global Assessment at 12.8 years follow-up
Outcome measure | Fusion surgery (n=187) | Medical interventional treatment (n=57) | Adjusted LTFU treatment effect: fusion compared with medical interventional (95% CI) | p-Value
Global Assessment (back pain)
Much better/better | 65% | 25% | 4.98 (2.47–9.12)* | <.001
Unchanged/worse | 35% | 75% | |
* Adjusted odds ratio from a multivariable logistic regression analysis with adjustment for baseline variables.
(PP model), automatically results in a falsely improved conservative group.
On the other hand, classifying group changers from conservative group to fusion group as "unchanged" or "worse" leads to other potential risks of bias. CLBP is a disorder that varies in severity over time, and to classify conservatively treated patients as "unchanged" or "worse" if they at any time after inclusion are fused may distort the analysis. The potential bias introduced by such an automatic classification is evident from the fact that the opposite logic does not apply for the fused patient; that is, the fused patient cannot change group and, thus, does not run the risk of being classified as "unchanged" or "worse" if temporarily in a non-improved state. Also, it may be argued that patients in the fusion group requiring a second procedure should automatically be classified as failures in the fusion group. This was, however, not feasible because the study plan included instrumentation removal after the 2-year follow-up to control fusion status. It was not possible to distinguish planned reoperations for assessing fusion from unplanned reoperations. In addition, patients in the surgical group most likely did receive various potentially efficient conservative therapies in the long course of follow-up without automatically being classified as failures. Thus, the problem of crossover patients cannot easily be solved; either one runs into the problem of failed conservative patients not being included in their original group (Models 2 and 3) or the potential problem of classifying too many of them as "failures" (Model 4).
Impact of outcome measure
In contrast to GA, no significant advantage of surgery could be demonstrated when analyzing standard outcome measures such as ODI, VAS back pain, or work status. The differences between the groups favored surgery, but they were neither statistically nor clinically significant. The same was true for "frequency of pain medication" and also for "frequency of pain"; over 40% of patients in both groups reported permanent pain. Similarly, to the question whether the patient "wanted to go through the same treatment again, knowing the outcome," there was no statistically significant difference between the groups; 85% of conservatively treated patients and 75% of fused patients answered yes to this question.
Because the GA was the primary outcome in the present study, and in quality of care discussions is often considered the most important patient-related outcome measure (PROM), our findings do lend some support to the conclusion of fusion being superior to non-specific physiotherapy in CLBP. There are, however, drawbacks with variables such as GA [11,12]. Although GA has been reported to correlate highly to other outcome measures, it has the potential of recall bias, which is most likely more pronounced in long-term follow-up studies [13]. It is also reported that GA of the present type has a tendency to overestimate the effect of surgical treatment [11]. Furthermore, it has been suggested that retrospective measures, such as the GA, are more affected by placebo than prospective measures [14]. Thus, also considering the other outcome measures, which are not prone to recall bias and did not show any differences between the groups, provides a fuller picture. Unfortunately, truly objective outcome measures in CLBP are not available. Ability to work, which may be the most objective outcome measure in the present study, was similar between the groups. One may conclude that the question on optimal model of analysis, as well as the one on optimal outcome measures, cannot be easily answered.
Outcome in relation to other studies
In contrast to Mannion et al., we found a better GA in the
fusion group compared with the conservative group. The difference
may be explained by the fact that their study was based
on physiotherapy, including cognitive therapy, whereas our
control group received non-structured physiotherapy with local
variations. Our control group may rather reflect the natural history
of CLBP, with variable physiotherapeutic interventions off
and on.
Several systematic reviews and meta-analyses comparing
surgical and conservative treatment of CLBP have reported
a small effect favoring surgery [15–19]. The pooled mean ODI
difference of −2.91 in the study of Saltychev et al. [18],
however, was considered neither clinically nor statistically significant.
Thus, several systematic reviews on largely the same
patient populations comparing fusion with conservative treatment
have come to similar conclusions: Fusion has a small
to moderate effect of questionable clinical significance on functional
disability and pain. A robust conclusion favoring any
treatment has not been possible.
The strengths of the present study are the randomized design,
the long-term follow-up, the high follow-up rate, and the
use of a wide range of validated outcome measures. There
are several limitations of the study, the most important being
one of the study objectives: how to quantify outcome in the
face of crossover patients. Despite attempts to control for
crossover patients by multiple models of analysis, the problem
remains, and the advantage of the RCT is lost; the RCT turns
into a prospective cohort study, and thereby must be considered
to generate Level 2 evidence. The present study clearly
demonstrates the difficulty of assessing long-term outcomes
in RCTs on CLBP comparing surgery with conservative
treatment. The same difficulties apply to previous RCTs
on the same subject, although these are not extensively
discussed.
Notwithstanding these problems, the different models of
analysis used in the present study resulted in a rather uniform
picture. The GA, the primary outcome measurement, was
found to be superior for surgical treatment compared with conservative
treatment in three out of four analytical models, with
seemingly clinically important differences. Whether today’s
selection of patients, or fusion techniques, or conservative
modes of treatment changes the long-term outcome cannot
be answered by the present study.
586 R. Hedlund et al. / The Spine Journal 16 (2016) 579–587
Downloaded from ClinicalKey.com at Naval Medical Center - San Diego June 11, 2016.
For personal use only. No other uses without permission. Copyright ©2016. Elsevier Inc. All rights reserved.
Conclusions
One can conclude that from the patient’s own perspective,
reflected by the GA, lumbar fusion surgery can be
considered in CLBP, as defined in the present study. On the
other hand, obtained data imply substantial remaining long-term
disability, suggesting a limited effect of lumbar fusion
in CLBP, best assessed by the PP model. The most important
result of the study may be that the crossover problem
transforms RCTs on surgical treatments to observational
studies, that is, Level 2 evidence.
References
[1] Fritzell P, Hagg O, Wessberg P, Nordwall A, Swedish Lumbar Spine
Study Group. 2001 Volvo Award Winner in Clinical Studies: lumbar
fusion versus nonsurgical treatment for chronic low back pain: a
multicenter randomized controlled trial from the Swedish Lumbar Spine
Study Group. Spine 2001;26:2521–32, discussion 32–4.
[2] Brox JI, Reikeras O, Nygaard O, Sørensen R, Indahl A, Holm I, et al.
Lumbar instrumented fusion compared with cognitive intervention and
exercises in patients with chronic back pain after previous surgery for
disc herniation: a prospective randomized controlled study. Pain
2006;122:145–55.
[3] Brox JI, Sorensen R, Friis A, Nygaard Ø, Indahl A, Keller A, et al.
Randomized clinical trial of lumbar instrumented fusion and cognitive
intervention and exercises in patients with chronic low back pain and
disc degeneration. Spine 2003;28:1913–21.
[4] Fairbank J, Frost H, Wilson-MacDonald J, Yu LM, Barker K, Collins
R, et al. Randomised controlled trial to compare surgical stabilisation
of the lumbar spine with an intensive rehabilitation programme for
patients with chronic low back pain: the MRC spine stabilisation trial.
BMJ 2005;330:1233.
[5] Mannion AF, Brox JI, Fairbank JC. Comparison of spinal fusion and
nonoperative treatment in patients with chronic low back pain: long-term
follow-up of three randomized controlled trials. Spine J 2013;13:1438–48.
[6] Fritzell P, Hagg O, Wessberg P, Nordwall A, Swedish Lumbar
Spine Study Group. Chronic low back pain and fusion: a comparison
of three surgical techniques: a prospective multicenter randomized
study from the Swedish Lumbar Spine Study Group. Spine 2002;1131–41.
[7] Weinstein JN, Lurie JD, Tosteson TD, Zhao W, Blood EA, Tosteson
AN, et al. Surgical compared with nonoperative treatment for lumbar
degenerative spondylolisthesis: four-year results in the Spine Patient
Outcomes Research Trial (SPORT) randomized and observational
cohorts. J Bone Joint Surg Am 2009;91:1295–304.
[8] Hernan MA, Hernandez-Diaz S. Beyond the intention-to-treat in
comparative effectiveness research. Clin Trials 2012;9:48–55.
[9] Hagg O, Fritzell P, Nordwall A, Swedish Lumbar Spine Study Group.
The clinical importance of changes in outcome scores after treatment
for chronic low back pain. Eur Spine J 2003;12:12–20.
[10] Carragee EJ, Cheng I. Minimum acceptable outcomes after lumbar
spinal fusion. Spine J 2010;10:313–20.
[11] Godil SS, Parker SL, Zuckerman SL, Mendenhall SK, Devin CJ, Asher
AL, et al. Determining the quality and effectiveness of surgical
spine care: patient satisfaction is not a valid proxy. Spine J
2013;13:1006–12.
[12] Copay AG, Martin MM, Subach BR, Carreon LY, Glassman SD,
Schuler TC, et al. Assessment of spine surgery outcomes: inconsistency
of change amongst outcome measurements. Spine J 2010;10:291–6.
[13] Hagg O, Fritzell P, Oden A, Nordwall A, Swedish Lumbar Spine Study
Group. Simplifying outcome measurement: evaluation of instruments
for measuring outcome after fusion surgery for chronic low back pain.
Spine 2002;27:1213–22.
[14] Hrobjartsson A, Gotzsche PC. Is the placebo powerless? An analysis
of clinical trials comparing placebo with no treatment. N Engl J Med
2001;344:1594–602.
[15] Ibrahim T, Tleyjeh IM, Gabbar O. Surgical versus non-surgical treatment
of chronic low back pain: a meta-analysis of randomised trials. Int
Orthop 2008;32:107–13.
[16] Mirza SK, Deyo RA. Systematic review of randomized trials comparing
lumbar fusion surgery to nonoperative care for treatment of chronic
back pain. Spine 2007;32:816–23.
[17] Chou R, Baisden J, Carragee EJ, Resnick DK, Shaffer WO, Loeser JD.
Surgery for low back pain: a review of the evidence for an American
Pain Society Clinical Practice Guideline. Spine 2009;34:1094–109.
[18] Saltychev M, Eskola M, Laimi K. Lumbar fusion compared with
conservative treatment in patients with chronic low back pain: a
meta-analysis. Int J Rehabil Res 2014;37:2–8.
[19] Bydon M, De la Garza-Ramos R, Macki M, Baker A, Gokaslan AK,
Bydon A. Lumbar fusion versus nonoperative management for treatment
of discogenic low back pain: a systematic review and meta-analysis
of randomized controlled trials. J Spinal Disord Tech 2014;27:297–304.