It is argued that when it comes to nuisance parameters an assumption of ignorance is harmful. On the other hand this raises problems as to how far one should go in searching for further data when combining evidence.
2. 2
Basic thesis
• A disadvantage of likelihood based frequentist methods
is that dealing with nuisance parameters is difficult
• In theory Bayesian methods are much more powerful for this
purpose
– Greater flexibility
• Frequentist methods tend to be either/or
• Bayesian can have compromise positions quite naturally
• However, in practice dealing with nuisance parameters is
not so easy even in a Bayesian framework
• In particular, uninformative priors on nuisance
parameters can be disastrous
3. 3
Anyone who is not shocked by quantum theory has not understood
a single word.
Niels Bohr
Anyone who is not shocked by the Bayesian theory of statistical
inference has not understood it
Stephen Senn
4. 4
Examples
• Carry-over for cross-over trials
• Variances in multi-armed parallel group
trials
• Slopes for covariates in analysis of
covariance
• Trial variances in meta-analysis
• Random effects variances
• Various biases (e.g. when data are
missing)
5. 5
Hills and Armitage Enuresis Data
[Figure: scatter plot of dry nights per patient, with one axis labelled "Dry nights placebo" and a line of equality. Points are distinguished by sequence: drug then placebo, and placebo then drug.]
Cross-over trial in
Enuresis
Two treatment periods of
14 days each
Treatment effect
significant if carry-over
not fitted:
2.037 (0.768, 3.306)
Treatment effect not
significant if carry-over
fitted:
0.451 (-2.272, 3.174)
1. Hills, M, Armitage, P. The two-period
cross-over clinical trial, British Journal of Clinical
Pharmacology 1979; 8: 7-20.
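The widening of the interval when carry-over is fitted follows from which data each estimator can use. The sketch below assumes the standard analysis of the two-period AB/BA design; it is not code from the talk, and the patient values are made up for illustration (they are not the Hills and Armitage data). Without carry-over, the treatment effect is estimated within patients from period differences. With carry-over fitted, only the first period identifies the treatment effect, so the comparison is between patients.

```python
from statistics import mean

def cros_estimate(drug_first, placebo_first):
    """Within-patient estimate of drug minus placebo, assuming no carry-over.

    Each argument is a list of (period 1, period 2) responses per patient.
    """
    d_dp = [p1 - p2 for p1, p2 in drug_first]      # drug - placebo, plus period effect
    d_pd = [p1 - p2 for p1, p2 in placebo_first]   # placebo - drug, plus period effect
    return (mean(d_dp) - mean(d_pd)) / 2           # period effect cancels

def par_estimate(drug_first, placebo_first):
    """Between-patient estimate using first-period data only."""
    first_drug = [p1 for p1, _ in drug_first]
    first_placebo = [p1 for p1, _ in placebo_first]
    return mean(first_drug) - mean(first_placebo)

# Hypothetical dry-night counts, for illustration only.
drug_first = [(8, 5), (14, 10), (10, 9), (6, 2)]
placebo_first = [(5, 9), (9, 12), (3, 7), (8, 10)]

print(cros_estimate(drug_first, placebo_first))
print(par_estimate(drug_first, placebo_first))
```

The first estimator removes between-patient variation; the second does not, which is why its interval is typically much wider.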
6. 6
Identical ‘uninformative’
prior placed on carry-over
as for treatment
NB Parameterisation here
means that parameter values
need to be doubled to
compare to conventional
contrasts
7. 7
Identical Priors for Treatment and Carryover?
• Patients treated repeatedly during trial
• Fourteen day treatment period
• Average time to last treatment plausibly 4 hours
• Average time to previous treatment seven days
• Saying that it is just as likely that carry-over
could be greater than treatment is not coherent
• In any case the two cannot be independent
• Is negative carry-over as likely as positive carry-
over?
10. 10
Nowadays more and more people know less and less about robust
statistics.
Frank Hampel
Nowadays more and more people know less and less about Bayesian
statistics and it BUGS me.
Stephen Senn
11. 11
Analysis by Preece. Note the clear heterogeneity in variances.
Lowest when comparing optical isomers
Highest when comparing treatments to control
12. 12
Why do we pool variances in
multi-armed trials?
• Hang-up from agriculture
– Experiments with few degrees of freedom
• 15
• Clinical trials different
– have many degrees of freedom
• >200
• For large phase III trials hardly any point in
pooling
• For small trials?
– Bayesian approach
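A rough way to see why pooling matters with 15 degrees of freedom but hardly at all with more than 200: for Normal data the relative standard error of a sample variance is sqrt(2/df). The function name below is illustrative, and the calculation assumes Normality.

```python
import math

def relative_se_of_variance(df):
    """Relative standard error of a sample variance from Normal data.

    Var(s^2) = 2 * sigma^4 / df, so SE(s^2) / sigma^2 = sqrt(2 / df).
    """
    return math.sqrt(2.0 / df)

for df in (15, 200):
    print(df, round(relative_se_of_variance(df), 3))
```

With 15 degrees of freedom the variance estimate is uncertain to roughly 37% of its value; with 200 it is about 10%, so borrowing degrees of freedom from other arms buys little.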
13. 13
A Problem with Conventional
Meta-analyses
• Tend to weight trials by observed precision
• But this depends on trial variances
• These are random variables
• Net result is that estimates can be inefficient
• Coverage probabilities poor
• This is identically a problem for fixed and
random effects meta-analysis
• Is difficult to deal with even in a Bayesian
approach for cases with few trials
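The points above can be checked with a small simulation. This is a sketch under simplifying assumptions, not an analysis from the talk: every trial estimates the same true effect (zero) with true variance 1, and each estimated variance is an independent scaled chi-square draw. Weighting by the estimated variances is compared with weighting by the true ones.

```python
import math
import random

def coverage(n_trials, df, n_sims=5000, use_true_variance=False, seed=1):
    """Coverage of the nominal 95% interval around an inverse-variance
    weighted mean, when the true effect is 0 and every true variance is 1."""
    rng = random.Random(seed)
    hits = 0
    for _sim in range(n_sims):
        weighted_sum = 0.0
        total_weight = 0.0
        for _trial in range(n_trials):
            y = rng.gauss(0.0, 1.0)
            if use_true_variance:
                s2 = 1.0
            else:
                # chi-square(df) / df, i.e. Gamma(shape=df/2, scale=2/df)
                s2 = rng.gammavariate(df / 2.0, 2.0 / df)
            weight = 1.0 / s2
            weighted_sum += weight * y
            total_weight += weight
        estimate = weighted_sum / total_weight
        half_width = 1.96 / math.sqrt(total_weight)   # claimed standard error
        if abs(estimate) <= half_width:
            hits += 1
    return hits / n_sims

print(coverage(5, 2))                           # weights from estimated variances
print(coverage(5, 2, use_true_variance=True))   # weights from true variances
```

With few degrees of freedom per trial the first figure falls well short of the nominal 95%, because trials that happen to show a small variance receive too much weight and the claimed precision is overstated.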
14. 14
[Figure: probability (vertical axis, 0.0 to 1.0) against number of centres (horizontal axis, 0 to 30), with one curve for two degrees of freedom and one for four degrees of freedom.]
Probability that ratio of variances will be at least 10 as a function
of number of centres
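A curve of this kind can be approximated by simulation. The sketch below assumes that the ratio in question is the largest to the smallest estimated variance across centres, that all centres share the same true variance, and that the data are Normal; none of this is stated on the slide.

```python
import random

def prob_ratio_at_least(n_centres, df, threshold=10.0, n_sims=20000, seed=1):
    """Monte Carlo probability that max/min of the estimated variances
    reaches the threshold when every centre has the same true variance."""
    rng = random.Random(seed)
    count = 0
    for _sim in range(n_sims):
        # chi-square(df) / df for each centre
        variances = [rng.gammavariate(df / 2.0, 2.0 / df)
                     for _centre in range(n_centres)]
        if max(variances) >= threshold * min(variances):
            count += 1
    return count / n_sims

for k in (2, 5, 10, 20, 30):
    print(k, prob_ratio_at_least(k, 2), prob_ratio_at_least(k, 4))
```

For two centres with two degrees of freedom each the exact value is 2/11, about 0.18, which gives a check on the simulation.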
15. 15
Lambert et al 2005
• Effect of choice of prior for random effects
variance on conclusions of meta-analysis
studied
– 13 priors investigated
• Example of meta-analysis of five trials of long
versus short therapy in otitis media
• Simulation based on this example
– 3 sample sizes: 5,10,30
– 3 random effect variances: 0.001,0.3,0.8
– 1000 runs per combination of prior, sample size and
variance
– 1000 ×13×3×3 = 117,000 MCMC calculations
16. 16
The Original Data
(Kozyrskyj et al, 2000)
Study              Short course   Long course
Boulesteix, 1995   11/124         11/118
Cohen, 1997        26/186         31/184
Hendrickse, 1988   14/74          6/77
Hoberman, 1997a    57/197         24/178
Hoberman, 1997b    57/197         40/189
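For orientation, the table can be turned into per-trial log odds ratios and a conventional random effects summary. This sketch assumes each cell is events/patients and uses the DerSimonian-Laird moment estimator, which is a swapped-in frequentist method and not the Bayesian MCMC analysis studied by Lambert et al. It also treats the two Hoberman comparisons as independent although they share the same short-course figures.

```python
import math

# (study, events short, n short, events long, n long), as tabulated above
TRIALS = [
    ("Boulesteix, 1995", 11, 124, 11, 118),
    ("Cohen, 1997", 26, 186, 31, 184),
    ("Hendrickse, 1988", 14, 74, 6, 77),
    ("Hoberman, 1997a", 57, 197, 24, 178),
    ("Hoberman, 1997b", 57, 197, 40, 189),
]

def log_odds_ratio(a, n1, c, n2):
    """Log odds ratio (short versus long) and its large-sample variance."""
    b, d = n1 - a, n2 - c
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def dersimonian_laird(effects, variances):
    """Return (pooled effect, its variance, tau^2) by the method of moments."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)   # truncated at zero
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, 1 / sum(w_star), tau2

results = [log_odds_ratio(a, n1, c, n2) for _name, a, n1, c, n2 in TRIALS]
effects = [r[0] for r in results]
variances = [r[1] for r in results]
pooled, pooled_var, tau2 = dersimonian_laird(effects, variances)
print(pooled, pooled_var, tau2)
```

Note that the weights here are built from the observed within-trial variances, so the summary inherits the weighting problem described earlier.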
17. 17
The difference between the Cochrane
Collaboration and me
The CC is always very concerned
that you may have missed some of
the evidence.
I am more concerned that you will
have found some evidence that
isn’t there
18. 18
Problems with the Priors
The priors for the random effects variance are specified independently of those for
the treatment effect, but this is clearly incoherent. We do not believe in large
variation of the true treatment effect from trial to trial unless the treatment
effect is on average large.
Prior number 13 is particularly inappropriate. It uses the harmonic mean of the
within-trial variances to establish the prior parameter. However these variances
are a) logically a posteriori to the prior and b) dependent on the size of trials
investigators happen to have chosen. It is curious that this should be taken as
being a relevant consideration for one's prior belief!
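Point b) can be made concrete. The sketch assumes, for illustration only, that each within-trial variance is a common sigma squared divided by the trial size; the numbers are hypothetical.

```python
from statistics import harmonic_mean

def within_trial_variances(sigma2, sizes):
    """Variance of each trial's estimate, assumed to be sigma2 / n."""
    return [sigma2 / n for n in sizes]

# Same underlying variability, but the second set of trials is twice as large.
small_trials = harmonic_mean(within_trial_variances(4.0, [50, 100, 200]))
large_trials = harmonic_mean(within_trial_variances(4.0, [100, 200, 400]))

print(small_trials, large_trials)
```

Doubling every trial size halves the harmonic mean, so a prior parameter built from it changes with the investigators' sample size choices even though nothing about the treatment has changed.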
19. 19
Allowing dependence between the prior
distributions
Large values of the random effect variance are implausible for modest
treatment effects. Therefore a dependence between the two seems
reasonable. One approach is to form a conditional prior for the random effects
variance given the treatment effect. Perhaps something like this:
[Equation lost in extraction: a conditional density f( · ; · , · ) containing an exponential term]
The following slide gives posterior distributions for four possible such prior
distributions and the example considered by Lambert et al. This is done using
numerical integration in Mathcad.
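The idea of integrating numerically over a dependent prior can be sketched on a grid. Everything specific here is an assumption made for illustration: the conditional prior (exponential for the random effects variance, with mean c times the squared treatment effect plus a small floor) is not the form used in the talk, and the constant c, the Normal prior on the treatment effect, the grids and the input data are all hypothetical.

```python
import math

def posterior_mu_grid(effects, variances, c=1.0, prior_sd=10.0):
    """Grid approximation to the marginal posterior of the mean effect mu in
    a Normal random effects model, with a prior for tau^2 that depends on mu."""
    mu_grid = [-2.0 + 0.02 * i for i in range(201)]      # -2 to 2
    tau2_grid = [0.005 + 0.01 * j for j in range(300)]   # midpoints on (0, 3)
    post = []
    for mu in mu_grid:
        prior_mu = math.exp(-0.5 * (mu / prior_sd) ** 2)
        scale = c * mu * mu + 0.01   # assumed conditional prior mean of tau^2
        total = 0.0
        for tau2 in tau2_grid:
            prior_tau2 = math.exp(-tau2 / scale) / scale
            loglik = 0.0
            for y, v in zip(effects, variances):
                s = v + tau2
                loglik += -0.5 * math.log(s) - 0.5 * (y - mu) ** 2 / s
            total += prior_tau2 * math.exp(loglik)       # sum over tau^2
        post.append(prior_mu * total)
    norm = sum(post)
    return mu_grid, [p / norm for p in post]             # masses sum to 1

# Hypothetical trial estimates and within-trial variances.
grid, mass = posterior_mu_grid([0.2, 0.5, 0.9], [0.1, 0.1, 0.1])
post_mean = sum(m * p for m, p in zip(grid, mass))
print(post_mean)
```

The grid is crude, and accuracy near a zero treatment effect depends on the floor and on the spacing of the variance grid, so this only indicates the mechanics.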
21. 21
This Raises an Issue
• Rather than using a theoretical model how
about some data?
• We now have a large number of
completed meta-analyses. What do they
teach us about the relationship between
random effect variances and treatment
effects?
• The following is based on results from 125
such analyses (Engels et al, 2000)
22. 22
RR ratio against the Absolute value of the Log(Random effects RR mean est)
[Figure: scatter plot with vertical axis running from 0 to 25 and horizontal axis, "Absolute Value of the Log(Random Effects Relative Risk mean estimates)", running from 0.0 to 1.4.]
Thanks to Eric Engels and Joseph Lau for the data and Nicola Greenlaw for the
analysis
23. 23
Correlations for three measures of
the treatment effect
Rows: random effects variance. Columns: treatment effect.
                   Relative Risk   Risk Difference   Odds Ratio
Relative Risk      0.4238          -                 -
Risk Difference    -               0.3524            -
Odds Ratio         -               -                 0.4194
Calculation by N Greenlaw, all results significant p = 0.0001 or less
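As a rough plausibility check on the stated significance, the Fisher z approximation can be applied to each correlation. This assumes Pearson correlations and that all 125 meta-analyses contribute to each one; neither is confirmed on the slide, and it is not necessarily the calculation that was actually performed.

```python
import math

def fisher_z_p_value(r, n):
    """Two-sided p-value for a correlation r from n pairs, using the
    Fisher z transformation and a Normal approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))

for label, r in (("Relative Risk", 0.4238),
                 ("Risk Difference", 0.3524),
                 ("Odds Ratio", 0.4194)):
    print(label, fisher_z_p_value(r, 125))
```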
24. 24
So how should I analyse these
data as a Bayesian?
• Build a hyper model for all meta-analyses to
date?
• Use this to inform my priors for random effect
variances for the next meta-analysis
• How about the treatment effect itself?
• Am I led inevitably into analysing everything
when I just want to analyse something?
• Am I faced with an ineluctable extended
context?
25. 25
The myth of objective Bayes
• Bayesian analysis cannot be a means of
recovering the truth
• It is a means of recovering your truth
• What do you believe?
• This must be specified in your prior
• Remember it is impossible for another Bayesian
to use your posterior distribution as input to his
or her Bayesian analysis.
• He or she would be better off with a frequentist
summary
26. 26
The Date of Information
Problem
Statistician: Here is the result of the analysis of the trial you
asked me to look at. I have added the likelihood to your prior.
This is the posterior distribution.
Physician: Excellent! Now could you please take the results
of the previous trials and do a meta-analysis?
Statistician (after a pause): There is no need. The result I gave
you is the meta-analysis. The previous trials are in your prior.
Physician (after a pause): If the previous trials are in my prior,
they got into my prior without your help at all. Why did I need
you to help with producing the posterior?
27. 27
The Bayesian Meta-Analyst’s Dilemma
In general Pn-1 + Dn → Pn
Step 1: P0 + D1 → P1,
Step 2: P1 + D2 → P2,
Or equivalently, P0 + D1 + D2 → P2
But suppose P0 already includes D1, then this analysis would
be illegitimate (like analysing 50 values using a chi-square
on the percentages).
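The updating scheme and the double-counting problem can be illustrated with the simplest conjugate case. The sketch assumes Normal priors and Normal data summaries with known variances; all numbers are hypothetical.

```python
import math

def update(prior_mean, prior_var, data_mean, data_var):
    """Conjugate Normal update with known variances: precisions add."""
    precision = 1 / prior_var + 1 / data_var
    mean = (prior_mean / prior_var + data_mean / data_var) / precision
    return mean, 1 / precision

p0 = (0.0, 4.0)   # prior mean and variance
d1 = (1.0, 1.0)   # first trial: estimate and variance
d2 = (2.0, 1.0)   # second trial

# Step 1 then step 2.
p1 = update(*p0, *d1)
p2 = update(*p1, *d2)

# Equivalent single step: pool D1 and D2 by precision, then update P0.
d12 = update(*d1, *d2)
p2_direct = update(*p0, *d12)

# Double counting: P1 already contains D1, and D1 is added again.
double_counted = update(*p1, *d1)

print(p2, p2_direct)
print(p1[1], double_counted[1])
```

The two routes to P2 agree, but adding D1 to a prior that already contains it shrinks the posterior variance with no new information, which is the illegitimate analysis described above.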
28. 28
The Dilemma Continued
So use step 2 only. But suppose that P1 does not include D1.
This would be equivalent to analysing a contingency table of 200
observations using a chi-square on the percentages.
Then the principle of total information has been violated.
(Note, however that the principle of total information seems to be
an independent principle which cannot be derived from Bayes
rule* except by imposing very artificial additional conditions.)
* NB Bayes rule ≠ Bayes theorem
29. 29
The Difference between
Mathematical and Applied
Statistics
Mathematical statistics is full of lemmas
whereas applied statistics is full of
dilemmas.
30. 30
Conclusion
• In theory the Bayesian approach provides
a natural framework for dealing with
nuisance parameters
• In practice it is rather hard to implement
• It is difficult to know what information to
incorporate and how far to look for it
• Using uninformative independent priors for
nuisance parameters can be dangerous
31. 31
A Final Challenge
• If you don’t believe me try the following
• How would I carry out a meta-analysis of a
single trial?