What is quantified in quantitative research,
  and how do you publish the findings?
              Jonas Ranstam PhD
Question: What is science?
Answer: Science is¹ generalizable knowledge.

        generalizable = reproducible and predictive




           ¹ US National Science Foundation
One major problem:

Sampling uncertainty
Plan
1. Medical research and uncertainty
2. The consequence of study design
3. Publishing uncertain results
1. Medical research and uncertainty
(Anecdotal evidence, case reports)

- Randomised clinical trial of streptomycin and tuberculosis (1948) (Bradford Hill)
- Case-control study of smoking and lung cancer (1950) (Bradford Hill)
- Cohort study of smoking and lung cancer (1954) (Bradford Hill)

→ Evaluation of sampling uncertainty
What is sampling uncertainty?
Observed sample: 354 consecutive patients with hip fracture treated at the Department of Orthopedics, Umeå University Hospital.

Unobserved population: all potential hip fracture patients all over the world, now, in the past, and in the future.



[Diagram: the observed sample of 354 hip fracture patients is only one of many possible samples from the unobserved population; another observed sample, a third observed sample, and so on, could equally well have been drawn.]
Now consider a laboratory experiment
To what population does experiment A belong?

[Diagram: a single box, Experiment A.]
To what population does experiment A belong?

The mother of all possible realizations of Experiment A.

[Diagram: Experiment A repeated many times.]
To what population does experiment A belong?

The mother of all possible repetitions of Experiment A.

[Diagram: Experiment A repeated many times; the spread of their results is the sampling variability.]
To what population does experiment A belong?

The mother of all possible repetitions of Experiment A.

[Diagram: the results of the repetitions are distributed around the population mean μ; their spread is the sampling variability.]
What is the sampling variability of these experiments?

[Diagram: the observed sampling variability after thousands of repetitions of Experiment A, a distribution of results around μ.]
Do we need to repeat each experiment thousands of times?

[Diagram: a single Experiment A, with its standard deviation SD and sample size n. What does it tell us about the sampling uncertainty?]
Can we say anything about sampling uncertainty if only one experiment is performed?

Yes. From a single experiment with standard deviation SD and sample size n, the standard error of the mean is

SEM = SD/√n

[Diagram: about 95% of all repetitions fall within ±1.96 SEM of μ; this quantifies the sampling uncertainty.]
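The claim that SD/√n from a single sample estimates the spread of the hypothetical repetitions can be checked with a short simulation. The population parameters (μ = 50, σ = 10) and sample size (n = 25) are invented values, not taken from the lecture:

```python
import random
import statistics

random.seed(1)

MU, SIGMA, N = 50.0, 10.0, 25   # hypothetical population and sample size

def sample_mean(n):
    """One 'experiment': draw n observations and return their mean."""
    return statistics.fmean(random.gauss(MU, SIGMA) for _ in range(n))

# Thousands of repetitions: the empirical SD of the sample means...
means = [sample_mean(N) for _ in range(10_000)]
empirical_se = statistics.stdev(means)

# ...is well approximated from a single experiment by SEM = SD / sqrt(n)
one_sample = [random.gauss(MU, SIGMA) for _ in range(N)]
sem = statistics.stdev(one_sample) / N ** 0.5

m = statistics.fmean(one_sample)
print(f"theoretical SE: {SIGMA / N ** 0.5:.2f}")   # 2.00
print(f"empirical SE over 10 000 repetitions: {empirical_se:.2f}")
print(f"SEM from one sample: {sem:.2f}")
print(f"95% CI from one sample: ({m - 1.96 * sem:.1f}, {m + 1.96 * sem:.1f})")
```

The empirical SE over thousands of repetitions and the SEM computed from a single sample both land close to the theoretical σ/√n = 2.0, which is why one experiment is enough to quantify sampling uncertainty.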
Now consider a ranking of hospitals
Do different ranks in league tables represent differences in “hospital quality”?

[Diagram: hospitals A–E ranked in a league table. Sampling variability?]
Or do the differences just reflect sampling variation?

[Diagram: the mother of all possible repetitions of Hospital A; repeated results for Hospital A spread around μ, showing its sampling variability.]
It depends on the degree of uncertainty!

[Diagram: hospitals A–E whose differences are large relative to the sampling variability; ICC ≈ 1.0.]
It depends on the degree of uncertainty!

[Diagram: hospitals A–E whose differences are entirely sampling variability; ICC = 0.]
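The ICC = 0 case is easy to simulate. In this sketch every hospital has the same true complication rate, so any league table produced is pure sampling variation; the number of hospitals, the rate, and the patient count per period are all assumptions:

```python
import random

random.seed(11)

HOSPITALS = list("ABCDE")
TRUE_RATE = 0.10   # identical true complication rate everywhere: ICC = 0
N = 200            # patients per hospital per ranking period (assumed)

def league_table():
    """Rank the hospitals by their observed complication rates."""
    observed = {h: sum(random.random() < TRUE_RATE for _ in range(N)) / N
                for h in HOSPITALS}
    return sorted(HOSPITALS, key=observed.get)

# The "quality" ranking reshuffles from period to period, although
# nothing about the hospitals has changed
for period in range(3):
    print(period, league_table())
```

Rankings produced this way look just as decisive as real ones, which is the point: a league table alone cannot tell the two situations apart.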
What is the difference between
quantitative and qualitative science?

       (sampling uncertainty)
Qualitative research
Sampling uncertainty is irrelevant for the
generalization



Quantitative research
Generalization requires quantification of sampling
uncertainty
Qualitative research: 100% of all crows are black

One white crow is sufficient to refute the statement.
Quantitative research: 99% of all crows are black

All crows cannot be studied simultaneously, but the
proportion of black crows can be estimated from a
random sample of crows.

Samples are characterized by sampling uncertainty.
This must be quantified to assess the empirical
support of the findings.
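The crow example can be sketched in a few lines. The sample size is an assumption, and the simple Wald interval used here behaves poorly for proportions very close to 1 (a Wilson score interval would be better); it is shown only because it matches the SE-based intervals used elsewhere in the lecture:

```python
import math
import random

random.seed(7)

P_BLACK = 0.99   # hypothetical true proportion of black crows
n = 1_000        # random sample of crows

black = sum(random.random() < P_BLACK for _ in range(n))
p_hat = black / n

# Wald 95% confidence interval: p_hat ± 1.96 * sqrt(p_hat * (1 - p_hat) / n)
se = math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - 1.96 * se, p_hat + 1.96 * se

print(f"estimate {p_hat:.3f}, 95% CI ({low:.3f}, {high:.3f})")
```

The interval is the quantified sampling uncertainty: it tells the reader which population proportions are compatible with the observed sample.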
Journal examples
What statements describe sampling uncertainty?
Generalizable knowledge

[Diagram: a single observation giving rise to many generalizations.]

P-values and confidence intervals are
used to quantify the uncertainty.

They help us generalize.
How is the uncertainty assessed?
Statistical precision
Statistical precision depends on:

a) the variability (SD) between independent observations

b) the number (n) of independent observations


The standard error of an estimate (SE) = SD/√n


With the same variability, a larger sample size is needed to
detect a smaller effect.
Example: Vaccine trial
Protection by a pandemic vaccine; 30% become ill without vaccine.

Sample size for at most a 5% risk of a false positive and a 20% risk
of a false negative result:

Protection          No. of patients
 90 %                   72
 80 %                   94
 70 %                  128
 60 %                  180
 50 %                  268
 40 %                  428
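The table above appears consistent with the classical normal-approximation sample-size formula for two proportions with Fleiss's continuity correction; the slide does not state which method was used, so the formula choice here is an inference, not the author's documented calculation:

```python
import math

Z_A, Z_B = 1.9599640, 0.8416212   # two-sided alpha = 0.05, power = 0.80

def total_patients(p_control, protection):
    """Total sample size (both groups, 1:1) for comparing two proportions,
    normal approximation with Fleiss's continuity correction."""
    p1 = p_control
    p2 = p_control * (1 - protection)      # risk among the vaccinated
    d = abs(p1 - p2)
    p_bar = (p1 + p2) / 2
    # uncorrected sample size per group
    n = (Z_A * math.sqrt(2 * p_bar * (1 - p_bar))
         + Z_B * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / d ** 2
    # continuity correction
    n_cc = n / 4 * (1 + math.sqrt(1 + 4 / (n * d))) ** 2
    return 2 * math.ceil(n_cc)

for protection in (0.9, 0.8, 0.7, 0.6, 0.5, 0.4):
    print(f"{protection:.0%}: {total_patients(0.30, protection)} patients")
```

With these assumptions the computed totals reproduce the table above, possibly differing by one rounding step for borderline values.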
Example: Observational safety study
Guillain-Barré syndrome: Incidence = 1×10⁻⁵ per person-year

Sample size for at most a 5% risk of a false positive and a 20% risk of a false negative result:

Relative risk           No. of patients            No. affected
   100                    1 098                 9 000
     50                   2 606                 4 500
     20                   9 075                 1 800
     10                  26 366                   900
      5                  92 248                   450
      2                 992 360                   180
Statistical precision
The p-value

The probability of by chance obtaining a result at least as
extreme as that observed, when no effect exists.


If |Diff_mean / SE_Diff| > 1.96, then p < 0.05,

and Diff_mean is considered statistically significant.
Statistical precision
Confidence interval

A range of values, which with specified confidence includes the
estimated population parameter.


Diff_mean ± 1.96 SE_Diff gives a 95% confidence interval
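The p-value and the confidence interval are two views of the same calculation. This sketch uses the means and SDs from the lecture's BMI example, but the group sizes (15 per group) are an assumption, and the normal approximation is used where small samples would call for a t distribution:

```python
import math

def diff_of_means(m1, sd1, n1, m2, sd2, n2):
    """Normal-approximation p-value and 95% CI for the difference of two
    independent means, computed from summary statistics (a sketch)."""
    diff = m2 - m1
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = diff / se
    # two-sided p-value from the standard normal distribution
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return diff, (diff - 1.96 * se, diff + 1.96 * se), p

# means/SDs as in the BMI example; n = 15 per group is assumed
diff, (lo, hi), p = diff_of_means(29.2, 6.9, 15, 33.8, 7.1, 15)
print(f"difference {diff:.1f}, 95% CI ({lo:.1f}, {hi:.1f}), p = {p:.2f}")
```

Note the duality: p > 0.05 and a 95% interval covering 0 are the same statement, but only the interval shows how large a difference remains compatible with the data.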
P-values are usually misconstrued
They do not

-   describe clinical relevance, because they depend on sample
    size

-   show that a difference “does not exist”, because statistical
    insignificance indicates absence of evidence, not evidence
    of absence

-   present the uncertainty in the magnitude of an effect or
    difference, because they relate only to the null effect (the null
    hypothesis)
Results
There was no difference in BMI (p = 0.09), see Table 1.



Table 1     BMI (mean ±SD)

Group 1.    29.2 ±6.9
Group 2.    33.8 ±7.1
Confidence intervals are better
than p-values
In contrast to p-values, they facilitate

-   assessment of clinical significance

-   assessment of whether a difference “does not exist”,
    because they present lower and upper limits of
    potential clinical effects/differences
Results
There was a difference in BMI of 4.1 (-0.3, 9.0) kg/m²,
see Table 1.



Table 1     BMI (mean ±SD)

Group 1.    29.2 ±6.9
Group 2.    33.8 ±7.1
P-value and confidence interval

[Diagram: the information in p-values and in confidence intervals, two possibilities each. p < 0.05 corresponds to a statistically significant effect; n.s. is inconclusive. The confidence intervals are drawn on an effect scale with 0 marked.]
P-value and confidence interval

[Diagram: the same p-value can support different conclusions, depending on where the confidence interval lies relative to 0 and to the threshold for clinically significant effects:]

- p < 0.05: statistically and clinically significant effect
- p < 0.05: statistically, but not necessarily clinically, significant effect
- n.s.: inconclusive
- n.s.: neither statistically nor clinically significant effect
- p < 0.05: statistically significant reversed effect
Superiority vs. non-inferiority

[Diagram: confidence intervals drawn on a scale from “control better” to “new agent better”, with 0 and a margin of non-inferiority or equivalence marked:]

- interval entirely on the new-agent side of 0: superiority shown
- interval on the new-agent side but closer to 0: superiority shown less strongly
- interval crossing both 0 and the margin: non-inferiority not shown, superiority not shown
- interval crossing 0 but inside the margin: non-inferiority shown, superiority not shown
- interval tightly around 0, inside the margins: equivalence shown, superiority not shown
Science as “significant observations”

[Diagram: Data → P < 0.05 (“There is a difference”) or NS (“There is no difference”).]
Science as “significant observations”

P < 0.05 [There is a difference]: a p-value can be meaningfully interpreted only when the hypothesis is defined a priori and when multiplicity issues are considered.

NS [There is no difference]: no, statistical insignificance indicates absence of evidence, not evidence of absence.
Science as “significant observations”

What should not be asked: Is there a statistically significant difference in the studied group of patients?

What should be asked: Is there an indication of a clinically significant difference among patients in general?
CMAJ 1989;141:881–883.
2. The consequence of study design
Evidence based medicine
1. Strong evidence from at least one systematic review of multiple
   well-designed randomized controlled trials.

2. Strong evidence from at least one properly designed randomized
   controlled trial of appropriate size.

3. Evidence from well-designed trials such as pseudo-randomized
   or non-randomized trials, cohort studies, time series or matched
   case-controlled studies.

4. Evidence from well-designed non-experimental studies from more
   than one center or research group or from case reports.

5. Opinions of respected authorities, based on clinical evidence,
   descriptive studies or reports of expert committees.
Any claim coming from an observational
study is most likely to be wrong
12 randomised trials have tested 52 observational claims (about
the effects of vitamin B6, B12, C, D, E, beta carotene, hormone
replacement therapy, folic acid and selenium).

“They all confirmed no claims in the direction of the observational
claim. We repeat that figure: 0 out of 52. To put it in another way,
100% of the observational claims failed to replicate. In fact, five
claims (9.6%) are statistically significant in the opposite direction
to the observational claim.”



Stanley Young and Allan Karr, Significance, September 2011
Even good observational research...
A series of observational studies published in the Lancet and the
NEJM generated and tested during the 1980s the hypothesis that
AIDS was caused by the side effect of a drug (amyl nitrite).

The authors of these publications also claimed to have identified
the biological mechanism and urged preventive measures.

Then the virus was detected.



Vandenbroucke JP and Pardoel VP. An autopsy of epidemiologic
methods: the case of “poppers” in the early epidemic of the
acquired immunodeficiency syndrome (AIDS). Am J Epidemiol
1989;129:455-457.
What is the most important methodological
  difference between observational and
          experimental studies?
Experimental vs. observational studies
Experiments

Bias is eliminated by design (“Block what you can, randomize
what you cannot”)

Statistical analysis: Focus on precision

Observation

Blocking and randomization are not possible. Bias must be taken
into consideration in the statistical analysis.

Statistical analysis: Focus on validity
Experimental studies
- Randomized clinical trials

- Laboratory experiments
Tests for baseline imbalance
Baseline imbalance after randomization is often tested. This
is not meaningful.

The purpose of randomization is to avoid systematic
imbalance (bias), not random errors (reduced precision).

The method to avoid random baseline imbalance is to use
stratified randomization.
Multiplicity
In contrast to many other forms of precision, statistical
precision depends on the number of measurements
performed (the number of hypotheses tested).

The probability of a false positive finding increases with
the number of performed tests.
Multiplicity
The risk of getting at least one false positive finding can be
calculated as 1 - (1 - α)k

where k is the number of performed comparisons and
α the significance level (usually 0.05).

Number of tests      Risk of at least one false positive

         1                      0.05
         2                      0.10
        10                      0.40
        20                      0.64
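The table's formula in two lines, together with the Bonferroni remedy (testing each of k hypotheses at level α/k), which the deck discusses on the next slide:

```python
alpha = 0.05

# Family-wise risk of at least one false positive among k independent
# tests at level alpha: 1 - (1 - alpha)**k
for k in (1, 2, 10, 20):
    print(f"{k:2d} tests: {1 - (1 - alpha) ** k:.2f}")

# Bonferroni keeps the family-wise rate near alpha by testing each of
# the k hypotheses at level alpha / k
print(f"Bonferroni level for 10 tests: {alpha / 10:.3f}")
```

Note that 20 tests already carry a roughly 64% risk of at least one false positive finding.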
Multiplicity
Adjustments of p-values can be made, but these reduce
the type 1 error rate at the expense of the type 2 error
rate, which means that a greater number of patients will be
needed, which in turn means higher cost.

Recommendation: Avoid multiplicity adjustments.

Laboratory experimenters often use Bonferroni correction
to address multiplicity issues within endpoints, but hardly
ever to correct for the multiplicity of endpoints. The work is
therefore hypothesis generating rather than confirmatory.
Statistical analyses
Type of test   Result

Confirmatory   Empirical support for a claim of superiority,
               equivalence or non-inferiority.

Hypothesis     A new hypothesis, which needs to be tested
generating     in a new hypothesis test.
How can I avoid multiplicity adjustments?

Most trials include more than one outcome.

Define a structure or hierarchy of endpoints: primary, secondary
and safety. Define primary endpoint(s) as confirmatory and
secondary as hypothesis generating.

No adjustment is necessary when statistical significance is
required for all of the multiple endpoints, or when the additional
tests are supporting or hypothesis generating.
Endpoints
Primary     The variable capable of providing the
            most clinically relevant evidence
            directly related to the primary objective
            of the trial

Secondary Effects related to secondary objectives,
          measurements supporting primary
          endpoint(s) or hypothesis generating tests.
Validity issues in randomized trials

External validity

Inclusion/exclusion criteria affect the representativeness of the
results (efficacy vs. effectiveness).

Internal validity

Some subjects withdraw from follow-up. The withdrawal may
depend on treatment and on the patient's characteristics.
This can bias both efficacy and effectiveness.
Study populations
Intention-to-treat   Analyze all randomized subjects
(ITT) principle      according to randomized treatment.

Full analysis set    The set of subjects that is as close
(FAS)                as possible to the ideal implied by
                     the ITT-principle.

Per protocol         The set of subjects who complied
(PP) set             with the protocol sufficiently to ensure
                     that they are likely to exhibit the
                     effects of treatment according to the
                     underlying scientific model.
FAS vs. PP-set
FAS        +   no selection bias
           -   misclassification problem (effect dilution)

PP-set     +   no contamination problem
           -   possible selection bias (confounding)


When the FAS and PP-set lead to essentially the same
conclusions, confidence in the trial is supported.
Clinical trials
International regulatory guidelines

ICH Topic E9 - Statistical Principles for Clinical Trials

EMEA Points to consider: baseline covariates
                - missing data
                - multiplicity issues
                - etc.

and similar documents from the FDA

These guidelines can all be found on the internet.
Observational studies
Main types

- Cross-sectional studies

- Cohort studies (prospective or historic)

- Case-control studies (always retrospective)
Observational studies
Validity
 Selection bias     (systematic differences between
                     comparison groups caused by
                     non-random allocation of subjects)

 Information bias   (misclassification, measurement
                     errors, etc.)

 Confounding bias   (inadequate analysis, flawed
                     interpretation of results)
Testing for confounding
Screening for statistically significant effects, or stepwise
regression, is often used to select covariates for inclusion in
a regression model.

However, confounding is a property of the sample, not of the
population. Hypothesis tests have no relevance.

The selection of covariates to adjust for must be based on
clinical knowledge and considerations of cause and effect.
All study designs are (more or less) problematic
Observational studies
  - Post hoc hypothesis tests, multiple testing
  - Multiple modeling, protopathic bias, confounding
  - Recycling of data

Experimental studies (laboratory experiments)
  - Multiple testing (Bonferroni correction within endpoints)
  - Small sample problems (often n=3)
  - Pseudoreplication and pooling of samples

Experimental studies (randomized clinical trials)
  - External validity
  - No long term effects
  - No infrequent events
Independent observations and replicates

[Diagram: two rats are sampled from a population with a mean (μ) of 50 and a standard deviation (σ) of 10, and ten measurements of an arbitrary outcome variable are made on each rat.]
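The pitfall in the rat example can be made concrete. The two rats' true levels and the measurement noise below are invented numbers; the point is only the choice of the unit of analysis:

```python
import random
import statistics

random.seed(3)

# Hypothetical setup: two rats whose true levels happen to differ,
# each measured ten times with small technical noise
rat_levels = [40.0, 60.0]
MEAS_SD = 2.0

measurements = [random.gauss(level, MEAS_SD)
                for level in rat_levels for _ in range(10)]

# Naive analysis: treat all 20 measurements as independent (n = 20)
naive_sem = statistics.stdev(measurements) / len(measurements) ** 0.5

# Correct analysis: the ten measurements per rat are replicates, not
# independent observations; the unit of analysis is the rat (n = 2)
rat_means = [statistics.fmean(measurements[:10]),
             statistics.fmean(measurements[10:])]
correct_sem = statistics.stdev(rat_means) / 2 ** 0.5

print(f"naive SEM (n = 20): {naive_sem:.2f}")
print(f"rat-level SEM (n = 2): {correct_sem:.2f}")
```

The naive SEM is far too small, which is how pseudoreplication manufactures spuriously significant results from what is really a two-animal experiment.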
3. Publishing uncertain results
A scientific report
The idea is to try and give all the information to help others to
judge the value of your contributions, not just the information
that leads to judgment in one particular direction or another.

                                          Richard P. Feynman
It is impossible to do clinical research
so badly that it cannot be published
“There seems to be no study too fragmented, no hypothesis too
trivial, no literature citation too biased or too egotistical, no design
too warped, no methodology too bungled, no presentation of
results too inaccurate, no argument too circular, no conclusions
too trifling or too unjustified, and no grammar and syntax too
offensive for a paper to end up in print.”




Drummond Rennie 1986 (editor of NEJM and JAMA)
Changes in publication practice
1658 – first scientific journals
1858 – the IMRAD structure
1957 – the abstract
1978 – Vancouver convention (ICMJE)
1987 – the structured abstract

Randomized clinical trials
1997 – Reporting guidelines (CONSORT)
1998 – Analysis guidelines (ICH)
2005 – Trial registration (Clinicaltrials.gov)

Observational studies
2007 – Reporting guidelines (STROBE)
2011 – Analysis guidelines (NARA, ICRS, etc.)
Clinical Trial Registration
In this editorial, published simultaneously in all member journals, the
International Committee of Medical Journal Editors (ICMJE)
proposes comprehensive trials registration as a solution to the
problem of selective awareness and announces that all 11 ICMJE
member journals will adopt a trials-registration policy to promote this
goal.

The ICMJE member journals will require, as a condition of
consideration for publication, registration in a public trials registry.
Trials must register at or before the onset of patient enrollment. This
policy applies to any clinical trial starting enrollment after July 1,
2005. For trials that began enrollment prior to this date, the ICMJE
member journals will require registration by September 13, 2005,
before considering the trial for publication. We speak only for
ourselves, but we encourage editors of other biomedical journals to
adopt similar policies.
Thank you for your attention!

More Related Content

Viewers also liked

Mind, Stress And Relallxation Health 2
Mind, Stress And Relallxation  Health 2Mind, Stress And Relallxation  Health 2
Mind, Stress And Relallxation Health 2duane francis
 
Atomic Business Overview 2009 Linkin
Atomic Business Overview 2009 LinkinAtomic Business Overview 2009 Linkin
Atomic Business Overview 2009 Linkinguestbf78f8b
 
Using evaluationtoenhancelearning pt copy
Using evaluationtoenhancelearning pt copyUsing evaluationtoenhancelearning pt copy
Using evaluationtoenhancelearning pt copygrodrigo
 
Advancing Learning: Our Adventure in the Twitterverse
Advancing Learning: Our Adventure in the TwitterverseAdvancing Learning: Our Adventure in the Twitterverse
Advancing Learning: Our Adventure in the Twitterversegrodrigo
 
Mind, Stress And Relallxation Health 2
Mind, Stress And Relallxation  Health 2Mind, Stress And Relallxation  Health 2
Mind, Stress And Relallxation Health 2duane francis
 
Mood Board P1 06
Mood Board P1 06Mood Board P1 06
Mood Board P1 06p106
 
The Evolving Internet Fndtn
The Evolving Internet FndtnThe Evolving Internet Fndtn
The Evolving Internet Fndtnguestbf78f8b
 
The Future Of War: U.S. National Security in the 21st Century
The Future Of War:  U.S. National Security in the 21st CenturyThe Future Of War:  U.S. National Security in the 21st Century
The Future Of War: U.S. National Security in the 21st CenturyDavid Williams
 
1malaysiafor Slideshare
1malaysiafor Slideshare1malaysiafor Slideshare
1malaysiafor SlideshareIbrahim Rahman
 
Destrezas pensamiento Robert Swartz smconectados
Destrezas pensamiento Robert Swartz smconectadosDestrezas pensamiento Robert Swartz smconectados
Destrezas pensamiento Robert Swartz smconectadosJosé Carlos Sancho
 
20110711 resume
20110711 resume20110711 resume
20110711 resumeknarimat
 

Viewers also liked (20)

Mind, Stress And Relallxation Health 2
Mind, Stress And Relallxation  Health 2Mind, Stress And Relallxation  Health 2
Mind, Stress And Relallxation Health 2
 
Oac guidelines
Oac guidelinesOac guidelines
Oac guidelines
 
Atomic Business Overview 2009 Linkin
Atomic Business Overview 2009 LinkinAtomic Business Overview 2009 Linkin
Atomic Business Overview 2009 Linkin
 
Lecture jr
Lecture jrLecture jr
Lecture jr
 
Using evaluationtoenhancelearning pt copy
Using evaluationtoenhancelearning pt copyUsing evaluationtoenhancelearning pt copy
Using evaluationtoenhancelearning pt copy
 
Vicky
VickyVicky
Vicky
 
Advancing Learning: Our Adventure in the Twitterverse
Advancing Learning: Our Adventure in the TwitterverseAdvancing Learning: Our Adventure in the Twitterverse
Advancing Learning: Our Adventure in the Twitterverse
 
Mind, Stress And Relallxation Health 2
Mind, Stress And Relallxation  Health 2Mind, Stress And Relallxation  Health 2
Mind, Stress And Relallxation Health 2
 
Nara guidelines-jr
Nara guidelines-jrNara guidelines-jr
Nara guidelines-jr
 
Brussels 2010
Brussels 2010Brussels 2010
Brussels 2010
 
Vicky
VickyVicky
Vicky
 
Mood Board P1 06
Mood Board P1 06Mood Board P1 06
Mood Board P1 06
 
The Evolving Internet Fndtn
The Evolving Internet FndtnThe Evolving Internet Fndtn
The Evolving Internet Fndtn
 
London 2008
London 2008London 2008
London 2008
 
The Future Of War: U.S. National Security in the 21st Century
The Future Of War:  U.S. National Security in the 21st CenturyThe Future Of War:  U.S. National Security in the 21st Century
The Future Of War: U.S. National Security in the 21st Century
 
Odense 2010
Odense 2010Odense 2010
Odense 2010
 
1malaysiafor Slideshare
1malaysiafor Slideshare1malaysiafor Slideshare
1malaysiafor Slideshare
 
Destrezas pensamiento Robert Swartz smconectados
Destrezas pensamiento Robert Swartz smconectadosDestrezas pensamiento Robert Swartz smconectados
Destrezas pensamiento Robert Swartz smconectados
 
20110711 resume
20110711 resume20110711 resume
20110711 resume
 
Amiqus Games Information
Amiqus Games InformationAmiqus Games Information
Amiqus Games Information
 

Similar to Umeapresjr

Statistics tests and Probablity
Statistics tests and ProbablityStatistics tests and Probablity
Statistics tests and ProbablityAbdul Wasay Baloch
 
Diagnotic and screening tests
Diagnotic and screening testsDiagnotic and screening tests
Diagnotic and screening testsjfwilson2
 
Review Z Test Ci 1
Review Z Test Ci 1Review Z Test Ci 1
Review Z Test Ci 1shoffma5
 
Test of-significance : Z test , Chi square test
Test of-significance : Z test , Chi square testTest of-significance : Z test , Chi square test
Test of-significance : Z test , Chi square testdr.balan shaikh
 
P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...David Pratap
 
Epidemiological statistics
Epidemiological statisticsEpidemiological statistics
Epidemiological statisticsGarima Aggarwal
 
statistics for MEDICAL RESEARH.... .pptx
statistics for MEDICAL RESEARH.... .pptxstatistics for MEDICAL RESEARH.... .pptx
statistics for MEDICAL RESEARH.... .pptxdhivyaramesh95
 
Lecture2 hypothesis testing
Lecture2 hypothesis testingLecture2 hypothesis testing
Lecture2 hypothesis testingo_devinyak
 
RMH Concise Revision Guide - the Basics of EBM
RMH Concise Revision Guide -  the Basics of EBMRMH Concise Revision Guide -  the Basics of EBM
RMH Concise Revision Guide - the Basics of EBMAyselTuracli
 

Similar to Umeapresjr (20)

Lund 2009
Lund 2009Lund 2009
Lund 2009
 
Testing Hypothesis
Testing HypothesisTesting Hypothesis
Testing Hypothesis
 
Statistics tests and Probablity
Statistics tests and ProbablityStatistics tests and Probablity
Statistics tests and Probablity
 
Hypothesis - Biostatistics
Hypothesis - BiostatisticsHypothesis - Biostatistics
Hypothesis - Biostatistics
 
Copenhagen 2008
Copenhagen 2008Copenhagen 2008
Copenhagen 2008
 
Diagnotic and screening tests
Diagnotic and screening testsDiagnotic and screening tests
Diagnotic and screening tests
 
Statistics
StatisticsStatistics
Statistics
 
Biostatistics
BiostatisticsBiostatistics
Biostatistics
 
Review Z Test Ci 1
Review Z Test Ci 1Review Z Test Ci 1
Review Z Test Ci 1
 
How to do the maths
How to do the mathsHow to do the maths
How to do the maths
 
Test of-significance : Z test , Chi square test
Test of-significance : Z test , Chi square testTest of-significance : Z test , Chi square test
Test of-significance : Z test , Chi square test
 
P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...
 
Why to know statistics
Why to know statisticsWhy to know statistics
Why to know statistics
 
Epidemiological statistics
Epidemiological statisticsEpidemiological statistics
Epidemiological statistics
 
statistics for MEDICAL RESEARH.... .pptx
statistics for MEDICAL RESEARH.... .pptxstatistics for MEDICAL RESEARH.... .pptx
statistics for MEDICAL RESEARH.... .pptx
 
Hypo
HypoHypo
Hypo
 
Biostatistics ii4june
Biostatistics ii4juneBiostatistics ii4june
Biostatistics ii4june
 
Lecture2 hypothesis testing
Lecture2 hypothesis testingLecture2 hypothesis testing
Lecture2 hypothesis testing
 
Displaying your results
Displaying your resultsDisplaying your results
Displaying your results
 
RMH Concise Revision Guide - the Basics of EBM
RMH Concise Revision Guide -  the Basics of EBMRMH Concise Revision Guide -  the Basics of EBM
RMH Concise Revision Guide - the Basics of EBM
 

More from Jonas Ranstam PhD (20)

The SPSS-effect on medical research
The SPSS-effect on medical researchThe SPSS-effect on medical research
The SPSS-effect on medical research
 
Sof stat issues_pro
Sof stat issues_proSof stat issues_pro
Sof stat issues_pro
 
Rcsyd pres nara
Rcsyd pres naraRcsyd pres nara
Rcsyd pres nara
 
Prague 2008
Prague 2008Prague 2008
Prague 2008
 
Oarsi jr1
Oarsi jr1Oarsi jr1
Oarsi jr1
 
Oac beijing jr
Oac beijing jrOac beijing jr
Oac beijing jr
 
Norsminde 2009
Norsminde 2009Norsminde 2009
Norsminde 2009
 
Malmo 30 03-2012
Malmo 30 03-2012Malmo 30 03-2012
Malmo 30 03-2012
 
Lund 2010
Lund 2010Lund 2010
Lund 2010
 
Karlskrona 2009
Karlskrona 2009Karlskrona 2009
Karlskrona 2009
 
Datavalidering jr1
Datavalidering jr1Datavalidering jr1
Datavalidering jr1
 
Amsterdam 2008
Amsterdam 2008Amsterdam 2008
Amsterdam 2008
 
Actalecturerungsted
ActalecturerungstedActalecturerungsted
Actalecturerungsted
 
Stockholm 6 7.11.2008
Stockholm 6 7.11.2008Stockholm 6 7.11.2008
Stockholm 6 7.11.2008
 
Prague 02.10.2008
Prague 02.10.2008Prague 02.10.2008
Prague 02.10.2008
 
Malmo 17.10.2008
Malmo 17.10.2008Malmo 17.10.2008
Malmo 17.10.2008
 
Malmo 11.11.2008
Malmo 11.11.2008Malmo 11.11.2008
Malmo 11.11.2008
 
Lund 30.09.2008
Lund 30.09.2008Lund 30.09.2008
Lund 30.09.2008
 
London 21.11.2008
London 21.11.2008London 21.11.2008
London 21.11.2008
 
Amsterdam 11.06.2008
Amsterdam 11.06.2008Amsterdam 11.06.2008
Amsterdam 11.06.2008
 

Umeapresjr

  • 1. What is quantified in quantitative research, and how do you publish the findings? Jonas Ranstam PhD
  • 2. Question: What is science?
  • 3. Answer: Science is1 generalizable knowledge. generalizable = reproducible and predictive 1. US National Science Foundation
  • 5. Plan 1. Medical research and uncertainty 2. The consequence of study design 3. Publishing uncertain results
  • 6. 1. Medical research and uncertainty
  • 8.
  • 9. Cohort study of smoking and lung cancer (1954) (Bradford Hill) Case-control study of smoking and lung cancer (1950) Evaluation of (Bradford Hill) sampling uncertainty Randomised clinical trial of streptomycin and tubercolosis (1948) (Bradford Hill)
  • 10. What is sampling uncertainty?
  • 11. Observed sample 354 consecutive patients with hip fracture treated at the Department of Orthopedics, Umeå University Hospital
  • 12. Unobserved population All potential hip fracture patients all over the world, now, earlier and future. Observed sample 354 consecutive patients with hip fracture treated at the Department of Orthopedics, Umeå University Hospital
  • 13. Unobserved population Observed A third observed sample sample Another observed sample
  • 14. Now consider a laboratory experiment
  • 15. To what population do experiment A belong? Experiment A
  • 16. To what population do experiment A belong? The mother of all possible realizations of Experiment A Experiment A Experiment A Experiment A Experiment A Experiment A
  • 17. To what population does experiment A belong? The mother of all possible repetitions of Experiment A: sampling variability.
  • 18. To what population does experiment A belong? The mother of all possible repetitions of Experiment A, with population mean μ: sampling variability.
  • 19. What is the sampling variability of these experiments? The mother of all possible repetitions of Experiment A, with population mean μ: the observed sampling variability after thousands of experiments.
  • 20. Do we need to repeat each experiment thousands of times? A single Experiment A, with SD and n, from a population with mean μ: what is the sampling uncertainty?
  • 21. Can we say anything about sampling uncertainty if only one experiment is performed? Yes: SEM = SD/√n, and the interval from -1.96 SEM to +1.96 SEM quantifies the sampling uncertainty.
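The SEM formula can be checked by simulation. Below is a minimal Python sketch (not part of the deck; the population mean, SD and sample size are illustrative assumptions) comparing SEM = SD/√n estimated from a single sample with the spread of thousands of simulated sample means:

```python
import math
import random
import statistics

random.seed(1)

MU, SIGMA, N = 50.0, 10.0, 25   # hypothetical population and sample size

# One experiment: N independent observations.
sample = [random.gauss(MU, SIGMA) for _ in range(N)]
sd = statistics.stdev(sample)
sem = sd / math.sqrt(N)          # SEM = SD / sqrt(n), from a single sample

# "Thousands of repetitions": the empirical SD of the sample means.
means = [statistics.mean([random.gauss(MU, SIGMA) for _ in range(N)])
         for _ in range(5000)]
empirical = statistics.stdev(means)

print(f"SEM estimated from one sample: {sem:.2f}")
print(f"SD of 5000 simulated means:    {empirical:.2f}")
```

With these settings both values land near σ/√n = 2.0, which is why a single experiment suffices to estimate the sampling uncertainty.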
  • 22. Now consider a ranking of hospitals
  • 23. Do different ranks in league tables represent differences in “hospital quality”? Hospitals A to E: sampling variability?
  • 24. Or do the differences just reflect sampling variation? The mother of all possible repetitions of Hospital A, with population mean μ: sampling variability.
  • 25. It depends on the degree of uncertainty! Hospitals A to E: sampling variability? ICC ≈ 1.0
  • 26. It depends on the degree of uncertainty! Hospitals A to E: sampling variability? ICC = 0
  • 27. What is the difference between quantitative and qualitative science? (sampling uncertainty)
  • 28. Qualitative research Sampling uncertainty is irrelevant for the generalization Quantitative research Generalization requires quantification of sampling uncertainty
  • 29. Qualitative research: 100% of all crows are black One white crow is sufficient to refute the statement.
  • 30. Quantitative research: 99% of all crows are black. All crows cannot be studied simultaneously, but the proportion of black crows can be estimated from a random sample of crows. Samples are characterized by sampling uncertainty. This must be quantified to assess the empirical support of the findings.
  • 35. What statements describe sampling uncertainty?
  • 40. Generalizable knowledge is built by repeated generalization from observation. P-values and confidence intervals are used to quantify the uncertainty. They help us generalize.
  • 42. How is the uncertainty assessed?
  • 43. Statistical precision. Statistical precision depends on: a) the variability (SD) between independent observations; b) the number (n) of independent observations. The standard error of an estimate: SE = SD/√n. With the same variability, a greater sample size is needed to detect a smaller effect.
  • 45. Example: Vaccine trial. Protection of pandemic vaccine: 30% ill without vaccine. Sample size for max 5% risk of a false positive and 20% risk of a false negative result. Protection and number of patients: 90%: 72; 80%: 94; 70%: 128; 60%: 180; 50%: 268; 40%: 428.
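As a rough illustration of how such tables are produced, a normal-approximation sample-size formula for comparing two proportions can be coded in a few lines. This is a generic textbook approximation, not necessarily the method behind the slide's table, so the numbers it prints differ somewhat from those above:

```python
import math
from statistics import NormalDist

def n_per_group(p1: float, p2: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for comparing two proportions
    (normal approximation, unpooled variances). Illustrative only; the
    slide's table may use a different method."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    num = (z_a + z_b) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil(num / (p1 - p2) ** 2)

attack_rate = 0.30                              # 30% ill without vaccine
for protection in (0.9, 0.8, 0.7, 0.6, 0.5, 0.4):
    p_vacc = attack_rate * (1 - protection)     # attack rate with vaccine
    print(f"{protection:.0%} protection: "
          f"{2 * n_per_group(attack_rate, p_vacc)} patients")
```

Note how the required number of patients grows sharply as the protective effect shrinks, which is the point of the slide.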
  • 46. Example: Observational safety study. Guillain-Barré syndrome: incidence = 1×10⁻⁵ per person-year. Sample size for max 5% risk of a false positive and 20% risk of a false negative result. Relative risk, number of patients, number affected: 100: 1 098, 9 000; 50: 2 606, 4 500; 20: 9 075, 1 800; 10: 26 366, 900; 5: 92 248, 450; 2: 992 360, 180.
  • 47. Statistical precision. The p-value: the probability of obtaining, by chance, a result at least as extreme as that observed when no effect exists. If |Diff_mean / SE_Diff| > 1.96, then p < 0.05 and Diff_mean is considered statistically significant.
  • 48. Statistical precision. Confidence interval: a range of values which, with specified confidence, includes the estimated population parameter. Diff_mean ± 1.96 SE_Diff gives a 95% confidence interval.
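To make the two definitions concrete, here is a small Python sketch (not from the deck; the summary statistics and the group size of n = 40 are illustrative assumptions) that turns two group summaries into an SE, a large-sample p-value and a 95% confidence interval:

```python
import math
from statistics import NormalDist

# Illustrative two-group BMI summaries; n = 40 per group is an assumption.
mean1, sd1, n1 = 29.2, 6.9, 40
mean2, sd2, n2 = 33.8, 7.1, 40

diff = mean2 - mean1
se_diff = math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# p-value: probability of a result at least this extreme when no effect exists.
z = diff / se_diff
p = 2 * (1 - NormalDist().cdf(abs(z)))

# 95% confidence interval: Diff_mean +/- 1.96 SE_Diff.
lo, hi = diff - 1.96 * se_diff, diff + 1.96 * se_diff

print(f"difference = {diff:.1f} kg/m2, 95% CI ({lo:.1f}, {hi:.1f}), p = {p:.3f}")
```

The p-value answers only whether the effect differs from zero; the interval also shows the range of effect sizes compatible with the data.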
  • 49. P-values are usually misconstrued. They do not - describe clinical relevance, because they depend on sample size - show that a difference “does not exist”, because statistical insignificance indicates absence of evidence, not evidence of absence - present the uncertainty in the magnitude of an effect or difference, because they relate only to a null effect (the null hypothesis).
  • 50. Results: “There was no difference in BMI (p = 0.09), see Table 1.” Table 1, BMI (mean ± SD): Group 1: 29.2 ± 6.9; Group 2: 33.8 ± 7.1.
  • 51. Confidence intervals are better than p-values. In contrast to p-values, they facilitate assessment of clinical significance and can show when a difference “does not exist”, because they present lower and upper limits of potential clinical effects/differences.
  • 52. Results: “There was a difference in BMI of 4.1 (-0.3, 9.0) kg/m2, see Table 1.” Table 1, BMI (mean ± SD): Group 1: 29.2 ± 6.9; Group 2: 33.8 ± 7.1.
  • 53. P-value and confidence interval. Information in a p-value (2 possibilities): p < 0.05, a statistically significant effect; n.s., inconclusive. Information in a confidence interval: in addition, the position of the interval relative to an effect of 0.
  • 54. P-value vs. conclusion from the confidence interval (relative to 0 and to clinically significant effects): p < 0.05, statistically and clinically significant effect; p < 0.05, statistically, but not necessarily clinically, significant effect; n.s., inconclusive; n.s., neither statistically nor clinically significant effect; p < 0.05, statistically significant reversed effect.
  • 55. Superiority vs. non-inferiority (confidence intervals plotted against 0 and the margin of non-inferiority or equivalence; control better on one side, new agent better on the other): superiority shown; superiority shown less strongly, non-inferiority not shown; superiority not shown, non-inferiority shown; superiority not shown, equivalence shown; superiority not shown.
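The interpretations above can be sketched as a short function. The function name, the symmetric margin convention (non-inferiority limit at -margin, equivalence region within ±margin) and the scale (positive values favour the new agent) are illustrative assumptions, not a regulatory algorithm:

```python
def classify(lo: float, hi: float, margin: float) -> str:
    """Interpret a 95% CI (lo, hi) for the effect of new agent vs control.

    Positive effects favour the new agent; `margin` is a positive number,
    with the non-inferiority limit at -margin. Illustrative sketch only."""
    if lo > 0:
        return "superiority shown"
    if lo > -margin:                     # CI stays above the margin
        if hi < margin:
            return "equivalence shown, superiority not shown"
        return "non-inferiority shown, superiority not shown"
    return "non-inferiority not shown"

print(classify(0.5, 3.0, 1.0))   # superiority shown
print(classify(-0.5, 2.0, 1.0))  # non-inferiority shown, superiority not shown
print(classify(-0.5, 0.8, 1.0))  # equivalence shown, superiority not shown
print(classify(-1.5, 0.5, 1.0))  # non-inferiority not shown
```

The point of the slide is that the same CI limits support different claims depending on where they fall relative to 0 and to the margin.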
  • 56. Science as “significant observations” P < 0.05 [There is a difference] Data NS [There is no difference]
  • 57. Science as “significant observations” P < 0.05 [There is a difference] A p-value can be meaningfully interpreted only when the hypothesis is defined a priori and when multiplicity issues are considered. Data NS [There is no difference] No, statistical insignificance indicates absence of evidence, not evidence of absence.
  • 58. Science as “significant observations” What should not be asked Is there a statistically significant difference in the studied group of patients? Data What should be asked Is there an indication of a clinically significant difference among patients in general?
  • 60. 2. The consequence of study design
  • 61. Evidence based medicine 1. Strong evidence from at least one systematic review of multiple well-designed randomized controlled trials. 2. Strong evidence from at least one properly designed randomized controlled trial of appropriate size. 3. Evidence from well-designed trials such as pseudo-randomized or non-randomized trials, cohort studies, time series or matched case-control studies. 4. Evidence from well-designed non-experimental studies from more than one center or research group or from case reports. 5. Opinions of respected authorities, based on clinical evidence, descriptive studies or reports of expert committees.
  • 62. Any claim coming from an observational study is most likely to be wrong. 12 randomised trials have tested 52 observational claims (about the effects of vitamin B6, B12, C, D, E, beta carotene, hormone replacement therapy, folic acid and selenium). “They all confirmed no claims in the direction of the observational claim. We repeat that figure: 0 out of 52. To put it in another way, 100% of the observational claims failed to replicate. In fact, five claims (9.6%) are statistically significant in the opposite direction to the observational claim.” Stanley Young and Allan Karr, Significance, September 2011
  • 63. Even good observational research... A series of observational studies published in the Lancet and the NEJM generated and tested during the 1980s the hypothesis that AIDS was caused by a side effect of a drug (amyl nitrite). The authors of these publications also claimed to have identified the biological mechanism and urged preventive measures. Then the virus was detected. Vandenbroucke JP and Pardoel VP. An autopsy of epidemiologic methods: the case of “poppers” in the early epidemic of the acquired immunodeficiency syndrome (AIDS). Am J Epidemiol 1989;129:455-457.
  • 64. What is the most important methodological difference between observational and experimental studies?
  • 65. Experimental vs. observational studies. Experiments: bias is eliminated by design (“Block what you can, randomize what you cannot”). Statistical analysis: focus on precision. Observational studies: blocking and randomization are not possible; bias must be taken into consideration in the statistical analysis. Statistical analysis: focus on validity.
  • 67. Experimental studies - Randomized clinical trials - Laboratory experiments
  • 70. Tests for baseline imbalance. Baseline imbalance after randomization is often tested. This is not meaningful. The purpose of randomization is to avoid systematic imbalance (bias), not random errors (reduced precision). The method to avoid random baseline imbalance is stratified randomization.
  • 71. Multiplicity In contrast to many other forms of precision, statistical precision depends on the number of measurements performed (the number of hypotheses tested). The probability of a false positive finding increases with the number of performed tests.
  • 72. Multiplicity. The risk of getting at least one false positive finding can be calculated as 1 - (1 - α)^k, where k is the number of performed comparisons and α the significance level (usually 0.05). Number of tests and risk of at least one false positive: 1: 0.05; 2: 0.10; 10: 0.40; 20: 0.64.
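The formula is easy to make concrete in code (a trivial sketch, not from the deck); note that for 20 independent tests it gives 0.64:

```python
def familywise_error(k: int, alpha: float = 0.05) -> float:
    """Risk of at least one false positive among k independent tests
    at significance level alpha: 1 - (1 - alpha)**k."""
    return 1 - (1 - alpha) ** k

for k in (1, 2, 10, 20):
    print(f"{k:2d} tests: {familywise_error(k):.2f}")
```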
  • 74. Multiplicity. Adjustments of p-values can be made, but these reduce the type 1 error rate at the expense of the type 2 error rate, which means that a greater number of patients will be needed, which in turn means higher cost. Recommendation: avoid multiplicity adjustments. Laboratory experimenters often use Bonferroni correction to address multiplicity issues within endpoints, but hardly ever to correct for the multiplicity of endpoints. The work is therefore hypothesis generating rather than confirmatory.
  • 75. Statistical analyses. Type of test and result: Confirmatory: empirical support for a claim of superiority, equivalence or non-inferiority. Hypothesis generating: a new hypothesis, which needs to be tested in a new hypothesis test.
  • 76. How can I avoid multiplicity adjustments? Most trials include more than one outcome. Define a structure or hierarchy of endpoints: primary, secondary and safety. Define primary endpoint(s) as confirmatory and secondary as hypothesis generating. No adjustment is necessary when statistical significance is required for all primary endpoints, or for supportive or exploratory hypothesis tests.
  • 77. Endpoints. Primary: the variable capable of providing the most clinically relevant evidence directly related to the primary objective of the trial. Secondary: effects related to secondary objectives, measurements supporting primary endpoint(s), or hypothesis generating tests.
  • 78. Validity issues in randomized trials. External validity: inclusion/exclusion criteria affect the representativity of the results (efficacy vs. effectiveness). Internal validity: some subjects withdraw from follow-up. The withdrawal may depend on treatment and on the patient's characteristics. This can bias both efficacy and effectiveness.
  • 79. Study populations. Intention-to-treat (ITT) principle: analyze all randomized subjects according to randomized treatment. Full analysis set (FAS): the set of subjects that is as close as possible to the ideal implied by the ITT principle. Per-protocol (PP) set: the set of subjects who complied with the protocol sufficiently to ensure that they are likely to exhibit the effects of treatment according to the underlying scientific model.
  • 80. FAS vs. PP set. FAS: + no selection bias; - misclassification problem (effect dilution). PP set: + no contamination problem; - possible selection bias (confounding). When the FAS and PP set lead to essentially the same conclusions, confidence in the trial is supported.
  • 82. Clinical trials International regulatory guidelines ICH Topic E9 - Statistical Principles for Clinical Trials EMEA Points to consider: baseline covariates - missing data - multiplicity issues - etc. and similar documents from the FDA These guidelines can all be found on the internet.
  • 83. Observational studies Main types - Cross-sectional studies - Cohort studies (prospective or historic) - Case-control studies (always retrospective)
  • 85. Observational studies. Validity: - Selection bias (systematic differences between comparison groups caused by non-random allocation of subjects) - Information bias (misclassification, measurement errors, etc.) - Confounding bias (inadequate analysis, flawed interpretation of results)
  • 89. Testing for confounding Screening for statistically significant effects, or stepwise regression, is often used to select covariates for inclusion in a regression model. However, confounding is a property of the sample, not of the population. Hypothesis tests have no relevance. The selection of covariates to adjust for must be based on clinical knowledge and considerations of cause and effect.
  • 90. All study designs are (more or less) problematic Observational studies - Post hoc hypothesis tests, multiple testing - Multiple modeling, protopatic bias, confounding - Recycling of data Experimental studies (laboratory experiments) - Multiple testing (Bonferroni correction within endpoints) - Small sample problems (often n=3) - Pseudoreplication and pooling of samples Experimental studies (randomized clinical trials) - External validity - No long term effects - No infrequent events
  • 92. Independent observations and replicates Two rats are sampled from a population with a mean (μ) of 50 and a standard deviation (σ) of 10, and ten measurements of an arbitrary outcome variable are made on each rat.
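This setup can be simulated to show the consequence of pseudoreplication; the code below is not from the deck, and the within-rat measurement SD of 2 is an illustrative assumption not given on the slide:

```python
import random
import statistics

random.seed(7)

MU, SIGMA_BETWEEN = 50.0, 10.0   # rat population, as on the slide
SIGMA_WITHIN = 2.0               # assumed within-rat measurement SD
N_RATS, N_MEAS = 2, 10

def study_mean() -> float:
    """Mean of all 20 values from one study: 2 rats x 10 measurements each."""
    values = []
    for _ in range(N_RATS):
        rat_level = random.gauss(MU, SIGMA_BETWEEN)   # the rat's true level
        values += [random.gauss(rat_level, SIGMA_WITHIN) for _ in range(N_MEAS)]
    return statistics.mean(values)

# Repeat the whole study many times and look at the spread of its mean.
sampling_sd = statistics.stdev(study_mean() for _ in range(5000))
print(f"SD of study means: {sampling_sd:.2f}")
# Close to SIGMA_BETWEEN / sqrt(2), about 7.1, not sigma / sqrt(20), about 2.3:
# only the two rats are independent observations.
```

Treating the 20 measurements as independent would understate the sampling uncertainty severely, because repeated measurements on the same rat are replicates, not independent observations.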
  • 94. A scientific report The idea is to try and give all the information to help others to judge the value of your contributions, not just the information that leads to judgment in one particular direction or another. Richard P. Feynman
  • 95. It is impossible to do clinical research so badly that it cannot be published “There seems to be no study too fragmented, no hypothesis too trivial, no literature citation too biased or too egotistical, no design too warped, no methodology too bungled, no presentation of results too inaccurate, no argument too circular, no conclusions too trifling or too unjustified, and no grammar and syntax too offensive for a paper to end up in print.” Drummond Rennie 1986 (editor of NEJM and JAMA)
  • 96. Changes in publication practice 1658 – first scientific journals 1858 – the IMRAD structure 1957 – the abstract 1978 – Vancouver convention (ICMJE) 1987 – the structured abstract Randomized clinical trials 1997 – Reporting guidelines (CONSORT) 1998 – Analysis guidelines (ICH) 2005 – Trial registration (Clinicaltrials.gov) Observational studies 2007 – Reporting guidelines (STROBE) 2011 – Analysis guidelines (NARA, ICRS, etc.)
  • 98. Clinical Trial Registration In this editorial, published simultaneously in all member journals, the International Committee of Medical Journal Editors (ICMJE) proposes comprehensive trials registration as a solution to the problem of selective awareness and announces that all 11 ICMJE member journals will adopt a trials-registration policy to promote this goal. The ICMJE member journals will require, as a condition of consideration for publication, registration in a public trials registry. Trials must register at or before the onset of patient enrollment. This policy applies to any clinical trial starting enrollment after July 1, 2005. For trials that began enrollment prior to this date, the ICMJE member journals will require registration by September 13, 2005, before considering the trial for publication. We speak only for ourselves, but we encourage editors of other biomedical journals to adopt similar policies.
  • 104. Thank you for your attention!