Research by MAGIC*
Magnitude
Articulation
Generality
Interest
Credibility
*Abelson, Statistics as Principled Argument
Magnitude
What’s the smallest result anyone will care about? Reduce the length of stay by one day? Decrease mortality from 1% to 0.9%?
Are we trying to prove that there is a meaningful difference, or that any difference is too small to care about?
Articulation – What’s the Story?
Variable(s) of primary interest (outcomes):
 Continuous: length of stay, pain scores
 Events: infection, DVTs
Confounding variables: may be demographics or comorbidities known, or reasonably expected, to affect the outcome.
Not all outcomes can be neatly measured as discrete events or physical units (pain, disability…).
Not every measurable variable is a confounder. Only control or match for the ones you are sure of.
Articulation – tell the complete story (using all relevant variables)
Articulation – a clear story
 Tell as much of your story as you can using graphs and tables. Clinicians are a visual audience.
 Can you explain how variables may interact to produce the observed results?
 Can you explain to a clinician (insurer, administrator, patient…) what the result means?
Articulation – telling the right story
[Four scatterplots: a straight line with error; a nonlinear curve with no error; no error except for one outlier; no result except for one outlier]
All of these have the same regression line and R².
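The slide’s point can be checked numerically. A minimal sketch, using the first two of Anscombe’s well-known quartet datasets (values from the published quartet; numpy assumed available), showing that very different shapes yield the same fitted line and R²:

```python
# Sketch (not from the slides): two of Anscombe's quartet datasets share
# the same x values but have very different shapes -- yet fit identically.
import numpy as np

x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])

def fit(x, y):
    """Return (slope, intercept, R^2) of the least-squares line, rounded."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    r2 = 1 - resid.var() / y.var()
    return round(slope, 2), round(intercept, 2), round(r2, 2)

print(fit(x, y1))  # (0.5, 3.0, 0.67) -- scattered around a straight line
print(fit(x, y2))  # (0.5, 3.0, 0.67) -- a clean parabola, same summary stats
```

Only plotting the data reveals the difference, which is why the slide insists on graphs.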
Generalizable
Who will be able to benefit from the results of your study?
All surgeons and patients?
A subset, such as:
 Urban or rural locations?
 Older or younger patients?
An infrequent result (5–10% of cases)?
Something so rare a surgeon may never see it?
Generality
ALL RETROSPECTIVE STUDIES ARE EXPLORATORY! Without comparing to another data set, you can’t confirm a finding.
GROUPS DEFINED BY THE OUTCOMES SHOULD BE SUSPECT! Your data set should not drive the analysis.
Interesting
“Not everything that counts can be counted, and not everything that can be counted, counts.” (Einstein, on endpoints)
Is this new information?
Is this useful? (see also: Generalizable)
Is this something you yourself would want to read about on your own?
Credibility – Data ain’t fish!
 You can make tasty imitation crabmeat, shrimp, etc. by mixing together cheaper fish and seasoning.
 You can NOT pull the same trick with data.
 Collect it right the first time.
(Image: https://en.wikipedia.org/wiki/Crab_stick)
Rosenwasser’s Special Case
“Meta-Analysis is to Analysis what Metaphysics is to Physics.” (Robert H. Rosenwasser, MD, FACS, FAHA)
A special case of “data ain’t fish”: good studies plus bad studies do not equal good on average.
Credibility – Prospective Studies
A 22-item checklist for good reporting of a randomized controlled trial is available at www.consort-statement.org.
Why Randomize?
If you don’t know what other factors affect the result, you can at least be confident they’re the same in all groups.
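The balancing effect of randomization can be sketched in a few lines. In this hypothetical simulation every patient carries an unmeasured risk factor; random assignment makes it average out across arms without anyone ever measuring it:

```python
# Sketch: with enough patients, randomization balances even factors
# we never measured. "risk" here stands in for an unmeasured confounder.
import random

random.seed(0)
patients = [{"risk": random.random()} for _ in range(10_000)]

groups = {"treat": [], "control": []}
for p in patients:
    # Coin-flip assignment, blind to the patient's risk.
    groups[random.choice(["treat", "control"])].append(p["risk"])

mean = lambda xs: sum(xs) / len(xs)
# The unmeasured risk factor ends up nearly equal in both arms.
print(abs(mean(groups["treat"]) - mean(groups["control"])))  # small
```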
Credibility – Retrospective Studies
Bradford Hill’s nine criteria for causality:
 Strength of Association
 Consistency with Prior Knowledge
 Specificity (more causes, less specific)
 Temporal relationship – cause before effect
 Dose response – more exposure, greater odds
 Plausibility – existing theory linking cause and effect
 Coherence – does not contradict existing knowledge
 Experimental evidence (such as animal studies)
 Analogy – parallels other known cause-effect associations
Presence doesn’t prove, absence doesn’t disprove, but each one helps.
Credibility: Math problem
If the Type I error is limited to 5%, then we expect one false positive out of 20 different tests where the null hypothesis is true. These could be:
 20 different studies from the same person
 20 different sites attempting the same study
 One study containing 20 different tests
This last case is the only one under our control.
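The arithmetic behind the one-in-twenty expectation is straightforward. A sketch (hypothetical function name): with 20 independent tests at α = 0.05 under true nulls, at least one false positive is more likely than not.

```python
# Sketch: chance of at least one false positive among N independent
# tests at a given alpha, when every null hypothesis is true.
def familywise_error(alpha, n_tests):
    """P(at least one false positive) = 1 - P(no false positives)."""
    return 1 - (1 - alpha) ** n_tests

print(round(familywise_error(0.05, 1), 3))   # 0.05
print(round(familywise_error(0.05, 20), 3))  # 0.642 -- more likely than not
```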
Correcting for multiple tests
In both one-tailed and two-tailed tests, the total Type I error probability (area in red) sums to α.
In two-tailed tests, the error is divided between the two tails, α/2 for each possibility.
Bonferroni and other corrections for multiple tests also divide up the Type I error between tests. Bonferroni divides α among N tests as α/N.
This correction protects against inflated Type I error.
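A minimal sketch of the Bonferroni rule described above (hypothetical function name and p-values, for illustration only):

```python
# Sketch: Bonferroni keeps the family-wise Type I error at or below
# alpha by testing each of N hypotheses at the stricter level alpha / N.
def bonferroni_reject(p_values, alpha=0.05):
    """Return which tests remain significant after Bonferroni correction."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]

p_values = [0.001, 0.01, 0.03, 0.20]  # hypothetical results from 4 tests
# Per-test threshold becomes 0.05 / 4 = 0.0125, so 0.03 no longer passes.
print(bonferroni_reject(p_values))    # [True, True, False, False]
```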
Intention to Treat
In randomized studies, analysis must always be based on the group patients were assigned to, even if they cross over.
 This prevents bias. For example, patients assigned to a non-operative group may still be given surgery, but operative patients can’t cross over to non-operative.
 Patients having more trouble with one treatment may be more likely to cross over or drop out.
 The intention-to-treat analysis doesn’t ask whether the treatment is effective; it asks whether the policy of assigning a patient to the treatment is effective.
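The difference between analyzing by assignment and by treatment received can be sketched with a toy cohort (all group names, outcomes, and the crossover pattern are hypothetical):

```python
# Sketch: intention-to-treat groups by assignment; an "as-treated"
# analysis groups by what actually happened. Crossovers make them differ.
patients = [
    # (assigned, received, success) -- hypothetical crossover example
    ("surgery", "surgery", 1),
    ("surgery", "surgery", 0),
    ("non-op",  "surgery", 0),   # crossed over after worsening
    ("non-op",  "non-op",  1),
]

def success_rate(analysis, group):
    """Success rate for a group under 'itt' or 'astreated' analysis."""
    outcomes = [s for a, r, s in patients
                if (a if analysis == "itt" else r) == group]
    return sum(outcomes) / len(outcomes)

print(success_rate("itt", "non-op"))        # 0.5 -- both assigned patients
print(success_rate("astreated", "non-op"))  # 1.0 -- only the one who stayed
```

The as-treated view flatters the non-operative arm precisely because the patient who was struggling left it, which is the bias the slide warns about.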
Six Ways to p-Hack
(list from Leif D. Nelson, Berkeley Initiative for Transparency in the Social Sciences)
 Stop collecting data once p < .05.
 Analyze many measures, but report only those with p < .05.
 Collect and analyze many conditions, but only report those with p < .05.
 Use covariates to get p < .05.
 Exclude participants to get p < .05.
Goodhart’s Law: when a measure becomes a target, it ceases to be a good measure.
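The first tactic, stopping data collection once p < .05, can be demonstrated by simulation. A rough sketch under a true null effect, using an approximate z cutoff rather than an exact t test:

```python
# Sketch: "stop once p < .05" inflates the false-positive rate far above
# 5%, even when the true effect is exactly zero.
import math
import random
import statistics

def peeking_trial(max_n=100, start=10, rng=random):
    """Draw from a null (mean 0, sd 1) but test after every new observation."""
    data = [rng.gauss(0, 1) for _ in range(start)]
    for _ in range(start, max_n):
        data.append(rng.gauss(0, 1))
        n = len(data)
        t = statistics.mean(data) / (statistics.stdev(data) / math.sqrt(n))
        if abs(t) > 1.96:          # rough two-sided 5% cutoff
            return True            # "significant" -- stop and publish
    return False                   # honest null result

random.seed(1)
hits = sum(peeking_trial() for _ in range(500))
print(hits / 500)  # well above the nominal 0.05
```

A fixed-n analysis of the same data would hold the false-positive rate near 5%; it is the repeated peeking that inflates it.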
COLLECT DATA CONSISTENTLY
As collected – revision required before analysis is practical:

| Sex  | Age (years) | Implant        | Smoker      | Disability (%) |
|------|-------------|----------------|-------------|----------------|
| M    | 45          | Acme Brass     | Y           | 75             |
| f    | 30          | Presto Ceramic | 2 packs/day | 45%            |
| Y    | N/A         | Zenith Ceramic | No          | 0.3            |
| male | 56          | Delta Brass    | NO          | 50             |
| F    | ?           | Metal          | Sometimes   | half           |

The same data, clearly coded with minimal chance of error:

| Male | Age (years) | Implant | Ever Smoked? | Disability (%) |
|------|-------------|---------|--------------|----------------|
| 1    | 45          | Brass   | 1            | 75             |
| 0    | 30          | Ceramic | 1            | 45             |
| 0    | .           | Ceramic | 0            | 30             |
| 1    | 56          | Brass   | 0            | 50             |
| 0    | .           | Brass   | 1            | 50             |
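Getting from the messy table to the clean one is mostly mechanical recoding. A sketch of one such rule (hypothetical helper name; ambiguous entries like “Sometimes” are flagged for review rather than guessed):

```python
# Sketch: normalize free-text smoking entries into the consistent
# 0/1 coding of the cleaned table before analysis.
def clean_smoker(value):
    """Map messy smoking entries to 1 (ever smoked), 0 (never), or None."""
    v = str(value).strip().lower()
    if v in {"y", "yes", "1"} or "pack" in v:
        return 1
    if v in {"n", "no", "0"}:
        return 0
    return None  # e.g. "Sometimes" -- needs review, not a guess

print([clean_smoker(v) for v in ["Y", "2 packs/day", "No", "NO", "Sometimes"]])
# [1, 1, 0, 0, None]
```

Far better, as the slide says, is to constrain entry so only the coded values can be recorded in the first place.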
Useful Cynicism from Statisticians
 All models are wrong, but some are useful. (George E. P. Box)
 An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem. (John Tukey)
 The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. (also John Tukey)
 To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination. (R. A. Fisher)
Also remember:
 People who interview you – whether
hiring committees or patients – are
going to remember whether you spoke
with depth, insight and enthusiasm.
 The difference between good medicine
and no medicine is generally smaller
than the difference between good
medicine and bad medicine. Caution
and skepticism help prevent getting bad
medicine out there.