In this talk I discuss our recent Bayesian reanalysis of the Reproducibility Project: Psychology.

The slides at the end include the technical details underlying the Bayesian model averaging method we employ.

1. A Bayesian Perspective On The Reproducibility Project: Psychology
   Alexander Etz & Joachim Vandekerckhove
   @alxetz ← My Twitter (no 'e' in alex)
   alexanderetz.com ← My website/blog

2. Purpose
•  Revisit the Reproducibility Project: Psychology (RPP)
•  Compute Bayes factors
•  Account for publication bias in original studies
•  Evaluate and compare levels of statistical evidence

3. TL;DR: Conclusions first
•  75% of studies find qualitatively similar levels of evidence in original and replication
•  64% find weak evidence (BF < 10) in both attempts
•  11% of studies find strong evidence (BF > 10) in both attempts

4. TL;DR: Conclusions first
•  75% of studies find qualitatively similar levels of evidence in original and replication
•  64% find weak evidence (BF < 10) in both attempts
•  11% of studies find strong evidence (BF > 10) in both attempts
•  10% find strong evidence in replication but not original
•  15% find strong evidence in original but not replication

5. The RPP
•  270 scientists attempt to closely replicate 100 psychology studies
•  Use original materials (when possible)
•  Work with original authors
•  Pre-registered to avoid bias
•  Analysis plan specified in advance
•  Guaranteed to be published regardless of outcome

6. The RPP
•  2 main criteria for grading replication

7. The RPP
•  2 main criteria for grading replication
•  Is the replication result statistically significant (p < .05) in the same direction as the original?
•  39% success rate

8. The RPP
•  2 main criteria for grading replication
•  Is the replication result statistically significant (p < .05) in the same direction as the original?
•  39% success rate
•  Does the replication's confidence interval capture the original reported effect?
•  47% success rate

9. The RPP
•  Neither of these metrics is any good
•  (at least not as used)
•  Neither makes predictions about out-of-sample data
•  Comparing significance levels is bad
•  "The difference between significant and not significant is not necessarily itself significant" – Gelman & Stern (2006)

10. The RPP
•  Nevertheless, a .51 correlation between original & replication effect sizes
•  Indicates at least some level of robustness

11. [Figure: scatterplot of replication effect size against original effect size; points marked by replication p-value (significant vs. not significant) and scaled by replication power (0.6–0.9).]
   Source: Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.

12. What can explain the discrepancies?

13. Moderators
•  Two study attempts are in different contexts
•  Texas vs. California
•  Different context = different results?
•  Conservative vs. liberal sample

14. Low power in original studies
•  Statistical power:
•  The frequency with which a study will yield a statistically significant effect in repeated sampling, assuming that the underlying effect is of a given size
•  Low-powered designs undermine the credibility of statistically significant results
•  Button et al. (2013)
•  Type M / Type S errors (Gelman & Carlin, 2014)

15. Low power in original studies
•  Replications planned to have minimum 80% power
•  Report average power of 92%

16. Publication bias
•  Most published results are "positive" findings
•  Statistically significant results
•  Most studies designed to reject H0
•  Most published studies succeed
•  Selective preference = bias
•  "Statistical significance filter"

17. Statistical significance filter
•  Incentive to have results that reach p < .05
•  "Statistically significant"
•  Evidential standard
•  Studies with large effect sizes achieve significance
•  Get published

18. Statistical significance filter
•  Studies with smaller effect sizes don't reach significance
•  Get suppressed
•  Average effect size inevitably inflates
•  Replication power calculations become meaningless

19. Can we account for this bias?
•  Consider publication as part of the data collection process
•  This enters through the likelihood function
•  Data generating process
•  Sampling distribution

20. Mitigation of publication bias
•  Remember the statistical significance filter
•  We try to build a statistical model of it

21. Mitigation of publication bias
•  We formally model 4 possible significance filters
•  4 models comprise the overall H0
•  4 models comprise the overall H1
•  If a result is consistent with bias, the Bayes factor is penalized
•  Raise the evidence bar

22. Mitigation of publication bias
•  Expected distribution of test statistics that make it into the literature

23. Mitigation of publication bias
•  None of these is probably right
•  (Definitely all wrong)
•  But it is a reasonable start
•  It doesn't really matter
•  We're going to mix and mash them all together
•  "Bayesian Model Averaging"

24. The Bayes Factor
•  How the data shift the balance of evidence
•  Ratio of the predictive success of the models
   BF10 = p(data | H1) / p(data | H0)

25. The Bayes Factor
•  H0: null hypothesis
•  H1: alternative hypothesis
   BF10 = p(data | H1) / p(data | H0)

26. The Bayes Factor
•  BF10 > 1 means the evidence favors H1
•  BF10 < 1 means the evidence favors H0
•  Need to be clear about what H0 and H1 represent
   BF10 = p(data | H1) / p(data | H0)

27. The Bayes Factor
•  H0: d = 0
   BF10 = p(data | H1) / p(data | H0)

28. The Bayes Factor
•  H0: d = 0
•  H1: d ≠ 0
   BF10 = p(data | H1) / p(data | H0)

29. The Bayes Factor
•  H0: d = 0
•  H1: d ≠ 0 (BAD)
•  Too vague
•  Doesn't make predictions
   BF10 = p(data | H1) / p(data | H0)

30. The Bayes Factor
•  H0: d = 0
•  H1: d ~ Normal(0, 1)
•  The effect is probably small
•  Almost certainly -2 < d < 2
   BF10 = p(data | H1) / p(data | H0)
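
As a quick check on the claim that the effect is "almost certainly" between -2 and 2 under this prior, the prior mass in that interval can be computed directly. A minimal Python sketch (the Normal(0, 1) prior is the one named on the slide; the code itself is not from the talk):

from scipy.stats import norm

# Prior on the standardized effect size under H1: d ~ Normal(0, 1)
prior = norm(loc=0, scale=1)

# Prior probability that the effect size falls between -2 and 2
p_between = prior.cdf(2) - prior.cdf(-2)
print(f"P(-2 < d < 2) = {p_between:.3f}")  # ~0.954
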
31. Statistical evidence
•  Do independent study attempts obtain similar amounts of evidence?

32. Statistical evidence
•  Do independent study attempts obtain similar amounts of evidence?
•  Same prior distribution for both attempts
•  Measuring general evidential content
•  We want to evaluate evidence from an outsider's perspective

33. Interpreting evidence
•  "How convincing would these data be to a neutral observer?"
•  1:1 prior odds for H1 vs. H0
•  50% prior probability for each

34. Interpreting evidence
•  BF > 10 is sufficiently evidential
•  10:1 posterior odds for H1 vs. H0 (or vice versa)
•  91% posterior probability for H1 (or vice versa)

35. Interpreting evidence
•  BF > 10 is sufficiently evidential
•  10:1 posterior odds for H1 vs. H0 (or vice versa)
•  91% posterior probability for H1 (or vice versa)
•  A BF of 3 is too weak
•  3:1 posterior odds for H1 vs. H0 (or vice versa)
•  Only 75% posterior probability for H1 (or vice versa)
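
The posterior probabilities quoted above follow from multiplying the Bayes factor by the 1:1 prior odds of the preceding slide and converting odds to probability. A minimal sketch of that arithmetic (illustrative, not part of the original slides):

def posterior_prob_h1(bf10, prior_odds=1.0):
    """Posterior probability of H1 from a Bayes factor and prior odds (H1 vs. H0)."""
    posterior_odds = bf10 * prior_odds
    return posterior_odds / (1 + posterior_odds)

print(posterior_prob_h1(10))  # ~0.91: BF of 10 gives ~91% posterior probability
print(posterior_prob_h1(3))   # 0.75: BF of 3 gives only 75% posterior probability
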
36. Interpreting evidence
•  It depends on context (of course)
•  You can have higher or lower standards of evidence

37. Interpreting evidence
•  How do p values stack up?
•  American Statistical Association:
•  "Researchers should recognize that a p-value … near 0.05 taken by itself offers only weak evidence against the null hypothesis."

38. Interpreting evidence
•  How do p values stack up?
•  p < .05 is a weak standard
•  p = .05 corresponds to BF ≤ 2.5 (at BEST)
•  p = .01 corresponds to BF ≤ 8 (at BEST)
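
These "at best" figures are consistent with the familiar upper bound BF ≤ −1 / (e · p · ln p) on the Bayes factor implied by a p value (Sellke, Bayarri & Berger, 2001). The slides do not name which bound was used, but that bound reproduces the quoted numbers; a small illustrative check:

import math

def max_bf(p):
    """Upper bound on the Bayes factor against H0 implied by a p value,
    using the -1 / (e * p * ln p) bound (valid for p < 1/e)."""
    return -1.0 / (math.e * p * math.log(p))

print(round(max_bf(0.05), 1))  # 2.5
print(round(max_bf(0.01), 1))  # 8.0
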
39. Face-value BFs
•  Standard Bayes factor
•  Bias free
•  Results taken at face value

40. Bias-mitigated BFs
•  Bayes factor accounting for possible bias

41. Illustrative Bayes factors
•  Study 27
•  t(31) = 2.27, p = .03
•  Maximum BF10 = 3.4

42. Illustrative Bayes factors
•  Study 27
•  t(31) = 2.27, p = .03
•  Maximum BF10 = 3.4
•  Face-value BF10 = 2.9

43. Illustrative Bayes factors
•  Study 27
•  t(31) = 2.27, p = .03
•  Maximum BF10 = 3.4
•  Face-value BF10 = 2.9
•  Bias-mitigated BF10 = .81

44. Illustrative Bayes factors
•  Study 71
•  t(373) = 4.4, p < .001
•  Maximum BF10 ≈ 2300

45. Illustrative Bayes factors
•  Study 71
•  t(373) = 4.4, p < .001
•  Maximum BF10 ≈ 2300
•  Face-value BF10 = 947

46. Illustrative Bayes factors
•  Study 71
•  t(373) = 4.4, p < .001
•  Maximum BF10 ≈ 2300
•  Face-value BF10 = 947
•  Bias-mitigated BF10 = 142

47. RPP Sample
•  N = 72
•  All univariate tests (t test, ANOVA with 1 model df, etc.)

48. Results
•  Original studies, face value
•  Ignoring publication bias
•  43% obtain BF10 > 10
•  57% obtain 1/10 < BF10 < 10
•  0 obtain BF10 < 1/10

49. Results
•  Original studies, bias-corrected
•  26% obtain BF10 > 10
•  74% obtain 1/10 < BF10 < 10
•  0 obtain BF10 < 1/10

50. Results
•  Replication studies, face value
•  No chance for bias, no need for correction
•  21% obtain BF10 > 10
•  79% obtain 1/10 < BF10 < 10
•  0 obtain BF10 < 1/10

51. Consistency of results
•  No alarming inconsistencies
•  46 cases where both original and replication show only weak evidence
•  Only 8 cases where both show BF10 > 10

52. Consistency of results
•  11 cases where original BF10 > 10, but not replication
•  7 cases where replication BF10 > 10, but not original
•  In every case, the study obtaining strong evidence had the larger sample size

53. Moderators?
•  As Laplace would say, we have no need for that hypothesis
•  Results adequately explained by:
•  Publication bias in original studies
•  Generally weak standards of evidence

54. Take-home message
•  Recalibrate our intuitions about statistical evidence
•  Reevaluate expectations for replications
•  Given weak evidence in original studies

55. Thank you

56. Thank you
   @alxetz ← My Twitter (no 'e' in alex)
   alexanderetz.com ← My website/blog
   @VandekerckhoveJ ← Joachim's Twitter
   joachim.cidlab.com ← Joachim's website

57. Want to learn more Bayes?
•  My blog:
•  alexanderetz.com/understanding-bayes
•  "How to become a Bayesian in eight easy steps"
•  http://tinyurl.com/eightstepsdraft
•  JASP summer workshop
•  https://jasp-stats.org/
•  Amsterdam, August 22-23, 2016

58. Technical Appendix

59. Mitigation of publication bias
•  Model 1: No bias
•  Every study has the same chance of publication
•  Regular t distributions
•  H0 true: central t
•  H0 false: noncentral t
•  These are used in standard BFs

60. Mitigation of publication bias
•  Model 2: Extreme bias
•  Only statistically significant results are published
•  t distributions, but…
•  Zero density in the middle
•  Spikes in the significant regions

61. Mitigation of publication bias
•  Model 3: Constant bias
•  Nonsignificant results published x% as often as significant results
•  t distributions, but…
•  Central regions downweighted
•  Large spikes over the significance regions

62. Mitigation of publication bias
•  Model 4: Exponential bias
•  "Marginally significant" results have a chance to be published
•  Harder as p gets larger
•  t likelihoods, but…
•  Spikes over the significance regions
•  Quick decay to zero density as (p – α) increases
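
To make the four filters concrete, here is one possible Python sketch of a publication-weight function w(x | θ) matching the verbal descriptions above. The specific functional forms, the θ parameter, and the two-sided α = .05 cutoff are illustrative assumptions, not the authors' exact specification:

import math
from scipy.stats import t as t_dist

def weight(t_stat, df, model, theta=0.1, alpha=0.05):
    """Illustrative publication probability w(x | theta) for a test statistic
    under the four bias models (1 = no bias, 2 = extreme, 3 = constant, 4 = exponential)."""
    t_crit = t_dist.ppf(1 - alpha / 2, df)        # two-sided significance cutoff
    significant = abs(t_stat) >= t_crit
    if model == 1:                                 # every result published
        return 1.0
    if model == 2:                                 # only significant results published
        return 1.0 if significant else 0.0
    if model == 3:                                 # nonsignificant results published theta as often
        return 1.0 if significant else theta
    if model == 4:                                 # publication chance decays as p grows past alpha
        p = 2 * t_dist.sf(abs(t_stat), df)
        return 1.0 if significant else math.exp(-(p - alpha) / theta)
    raise ValueError("model must be 1, 2, 3, or 4")
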
63. Calculate face-value BF
•  Take the likelihood: t_n(x | δ)
•  Integrate with respect to prior distribution, p(δ)

64. Calculate face-value BF
•  Take the likelihood: t_n(x | δ)
•  Integrate with respect to prior distribution, p(δ)
•  "What is the average likelihood of the data given this model?"
•  Result is the marginal likelihood, M

65. Calculate face-value BF
•  For H1: δ ~ Normal(0, 1)
•  For H0: δ = 0
   M− = t_n(x | δ = 0)
   M+ = ∫_Δ t_n(x | δ) p(δ) dδ

66. Calculate face-value BF
•  For H1: δ ~ Normal(0, 1)
•  For H0: δ = 0
   M− = t_n(x | δ = 0)
   M+ = ∫_Δ t_n(x | δ) p(δ) dδ
   BF10 = p(data | H1) / p(data | H0) = M+ / M−
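
A numerical sketch of this face-value Bayes factor, assuming a one-sample design in which x is the observed t statistic with df = n − 1 and noncentrality δ√n (the exact design and parameterization are assumptions; the slides give only the integrals above):

import numpy as np
from scipy import integrate
from scipy.stats import nct, t as t_dist, norm

def face_value_bf10(t_obs, n):
    """Face-value BF10 for an observed one-sample t statistic with n observations,
    comparing H1: delta ~ Normal(0, 1) against H0: delta = 0."""
    df = n - 1
    # M-: central t density at the observed statistic (delta = 0)
    m_minus = t_dist.pdf(t_obs, df)
    # M+: noncentral t density averaged over the Normal(0, 1) prior on delta
    integrand = lambda delta: nct.pdf(t_obs, df, delta * np.sqrt(n)) * norm.pdf(delta)
    m_plus, _ = integrate.quad(integrand, -np.inf, np.inf)
    return m_plus / m_minus

# Example with the t(31) = 2.27 statistic from Study 27; because the design details
# here are assumed, the value need not match the 2.9 reported in the slides.
print(face_value_bf10(2.27, n=32))
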
67. Calculate mitigated BF
•  Start with the regular t likelihood function: t_n(x | δ)
•  Multiply it by a bias function: w = {1, 2, 3, 4}
•  Where w = 1 is no bias, w = 2 is extreme bias, etc.

68. Calculate mitigated BF
•  Start with the regular t likelihood function: t_n(x | δ)
•  Multiply it by a bias function: w = {1, 2, 3, 4}
•  Where w = 1 is no bias, w = 2 is extreme bias, etc.
•  E.g., when w = 3 (constant bias): t_n(x | δ) × w(x | θ)

69. Calculate mitigated BF
•  Too messy
•  Rewrite as a new function

70. Calculate mitigated BF
•  Too messy
•  Rewrite as a new function
•  H1: δ ~ Normal(0, 1):
   p_w+(x | n, δ, θ) ∝ t_n(x | δ) × w(x | θ)

71. Calculate mitigated BF
•  Too messy
•  Rewrite as a new function
•  H1: δ ~ Normal(0, 1):
   p_w+(x | n, δ, θ) ∝ t_n(x | δ) × w(x | θ)
•  H0: δ = 0:
   p_w−(x | n, θ) = p_w+(x | n, δ = 0, θ)

72. Calculate mitigated BF
•  Integrate w.r.t. p(θ) and p(δ)
•  "What is the average likelihood of the data given each bias model?"
•  p(data | bias model w):

73. Calculate mitigated BF
•  Integrate w.r.t. p(θ) and p(δ)
•  "What is the average likelihood of the data given each bias model?"
•  p(data | bias model w):
   M_w+ = ∫_Δ ∫_Θ p_w+(x | n, δ, θ) p(δ) p(θ) dδ dθ

74. Calculate mitigated BF
•  Integrate w.r.t. p(θ) and p(δ)
•  "What is the average likelihood of the data given each bias model?"
•  p(data | bias model w):
   M_w+ = ∫_Δ ∫_Θ p_w+(x | n, δ, θ) p(δ) p(θ) dδ dθ
   M_w− = ∫_Θ p_w−(x | n, θ) p(θ) dθ
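
A numerical sketch of these marginal likelihoods, building on the weight() and face-value helpers sketched above. It assumes, purely for illustration, a Uniform(0, 1) prior on θ and makes the proportionality constant in p_w+ explicit by renormalizing the weighted density over the sample space; none of these specifics are stated in the slides:

import numpy as np
from scipy import integrate
from scipy.stats import nct, norm

def weighted_density(t_obs, n, delta, theta, model):
    """p_w(x | n, delta, theta): noncentral t density times the publication weight,
    renormalized so it is a proper sampling distribution for published results."""
    df = n - 1
    kernel = lambda x: nct.pdf(x, df, delta * np.sqrt(n)) * weight(x, df, model, theta)
    norm_const, _ = integrate.quad(kernel, -np.inf, np.inf)
    return kernel(t_obs) / norm_const

def m_plus_w(t_obs, n, model):
    """M_w+: average of p_w+ over delta ~ Normal(0, 1) and theta ~ Uniform(0, 1)."""
    f = lambda theta, delta: weighted_density(t_obs, n, delta, theta, model) * norm.pdf(delta)
    value, _ = integrate.dblquad(f, -8, 8, lambda _: 0, lambda _: 1)  # delta outer, theta inner
    return value

def m_minus_w(t_obs, n, model):
    """M_w-: average of p_w- over theta ~ Uniform(0, 1), with delta fixed at 0."""
    f = lambda theta: weighted_density(t_obs, n, 0.0, theta, model)
    value, _ = integrate.quad(f, 0, 1)
    return value
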
75. Calculate mitigated BF
•  Take M_w+ and M_w−, multiply each by the weight of the corresponding bias model, then sum within each hypothesis

76. Calculate mitigated BF
•  Take M_w+ and M_w−, multiply each by the weight of the corresponding bias model, then sum within each hypothesis
•  For H1: p(w=1)M_1+ + p(w=2)M_2+ + p(w=3)M_3+ + p(w=4)M_4+

77. Calculate mitigated BF
•  Take M_w+ and M_w−, multiply each by the weight of the corresponding bias model, then sum within each hypothesis
•  For H1: p(w=1)M_1+ + p(w=2)M_2+ + p(w=3)M_3+ + p(w=4)M_4+
•  For H0: p(w=1)M_1− + p(w=2)M_2− + p(w=3)M_3− + p(w=4)M_4−

78. Calculate mitigated BF
•  Messy, so we restate as sums:

79. Calculate mitigated BF
•  Messy, so we restate as sums:
•  For H1: Σ_w p(w) M_w+

80. Calculate mitigated BF
•  Messy, so we restate as sums:
•  For H1: Σ_w p(w) M_w+
•  For H0: Σ_w p(w) M_w−

81. Calculate mitigated BF
•  Messy, so we restate as sums:
•  For H1: Σ_w p(w) M_w+
•  For H0: Σ_w p(w) M_w−
   BF10 = p(data | H1) / p(data | H0) = [Σ_w p(w) M_w+] / [Σ_w p(w) M_w−]
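
Finally, a sketch of the model-averaged (bias-mitigated) Bayes factor assembled from the helpers above, assuming equal prior weights p(w) = 1/4 on the four bias models (the slides do not state the model weights used):

def mitigated_bf10(t_obs, n, model_weights=(0.25, 0.25, 0.25, 0.25)):
    """Bias-mitigated BF10: ratio of marginal likelihoods averaged over the bias models."""
    numerator = sum(pw * m_plus_w(t_obs, n, w)
                    for w, pw in zip((1, 2, 3, 4), model_weights))
    denominator = sum(pw * m_minus_w(t_obs, n, w)
                      for w, pw in zip((1, 2, 3, 4), model_weights))
    return numerator / denominator

# Example with Study 27's t(31) = 2.27; with these illustrative priors and weight
# functions the result will not necessarily equal the .81 reported in the slides.
print(mitigated_bf10(2.27, n=32))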