Fallibility in science: Responsible ways to handle mistakes

Slides from a talk at the Department of Psychology, University of Amsterdam, November 2017

Published in: Science

  1. 1. Fallibility in science: Responsible ways to handle mistakes Dorothy V. M. Bishop Professor of Developmental Neuropsychology University of Oxford @deevybee
  2. 2. Thought experiment #1 • PhD student, David, has run a series of studies trying to find an impact of brain stimulation on language comprehension in stroke patients • After three studies with null findings, he has changed the design in various ways and is overjoyed when the 4th study gives a significant effect • The paper is published, with David as first author and his eminent supervisor as last author, in Nature. • The university press office features the study and it is highlighted on the BBC Radio 4 Today programme. • Two weeks later, when preparing slides for a talk at Society for Neuroscience, David finds the groups were miscoded, and in fact the sham treatment group obtained higher post-training scores
  3. 3. Questions • What should David do? • If disclosed, what impact will this have on David’s career and that of his supervisor? • If undisclosed, what impact will this have on David’s career and that of his supervisor? • Could this mistake have been avoided?
  4. 4. http://prawnsandprobability.blogspot.co.uk/2013/03/rethinking-retractions.html?m=1 • I was now due to give an hour long seminar in ~3 days that focused on some completely false results. • The paper I had been writing with Mike and David was now floundering without a data set, and my contribution had been wiped out • Worst of all: I had to tell my co-authors on the original paper that our results were invalid, that we would have to retract the paper and that it was ALL MY FAULT for not checking the code well enough. Michael: Hey, are you ready for some news Richard Mann: bring it Michael: Dave reckons you only used 1/100th of the data in the .m files you sent us, rather than 1/2 as it seems you intended Basically just data from a single trial Richard Mann: ... ......... um, ok
  5. 5. https://www.statnews.com/2017/06/01/shrimp-study-error/ What did Richard Mann do? • Confessed to PI in ‘extremely drunk Skype conversation’ • Wrote apologetic letter to retract the paper • Didn’t sleep much for several months • Reanalysed the data correctly and published a paper – in the same journal His advice: • Can’t rely on reviewers to catch errors like this • Sharing code and data is best way to avoid such errors • We need a system whereby retractions don’t carry stigma
  6. 6. https://whatsinjohnsfreezer.com/2014/05/10/co-rex-ions/ • Studied growth rates in Tyrannosaurus • Amateur paleontologist Nathan Myhrvold found irregularities in the data • On reexamination Hutchinson agreed estimates were ‘not good enough for firm conclusions’ • Retracted all aspects of growth rates from that paper • Blogged about his experience John Hutchinson
  7. 7. “My message … is to get out in front of problems like this, as an author. Don’t wait for someone else to point it out. If you find mistakes, correct them ASAP. Especially if they (1) involve inaccurate data in the paper (in text, figures, tables, whatever), (2) would lead others to be unable to reproduce your work in any way, even if they had all your original methods and data, or (3) alter your conclusions. It is far less excruciating to do it this way then to have someone else force you to do it, which will almost inevitably involve more formality, deeper probing, exhaustion and embarrassment. And there is really no excuse that you don’t have time to do it.” https://whatsinjohnsfreezer.com/2014/05/10/co-rex-ions/ John Hutchinson
  8. 8. https://www.statnews.com/2017/05/05/dirt-award-cleaning-scientific-literature/ http://retractionwatch.com/category/by-reason-for-retraction/doing-the-right-thing/
  9. 9. http://retractionwatch.com/2017/03/27/authors-retract-honest-error-say-arent-penalized-result/#more-48973 Interviews with 14 scientists who retracted papers for honest errors between 2010 and 2015 • Authors who retract for honest error say they are not penalised • Indeed, may get kudos for integrity • But notes that if authors ask to correct a paper, journal often decides on retraction • Important to de-stigmatise retraction. • Usual focus is on negative examples where papers retracted for fraud, etc. ECRs need to hear about retraction for honest error and realise it is OK
  10. 10. Thought experiment #2 • A doctoral student, Helen, has run a study using auditory event-related potentials (ERP) to compare discrimination of certain sounds in people with dyslexia vs nondyslexic controls • She has published the data in PLOS One and has deposited the anonymised raw EEG files on the Dataverse public repository • Three years later, a researcher from Iran contacts her to say that he has reanalysed her EEG files and is unable to reproduce her results. He has requested her analysis scripts. He has no publication record and has very poor English. • Helen is now working on a different project and is under intense time pressure to produce publications for a fellowship proposal. She cannot find the scripts.
  11. 11. Questions • What should Helen do? • Would this kind of experience deter you from making your data open?
  12. 12. http://www.russpoldrack.org/2013/02/anatomy-of-coding-error.html ”None of us likes to admit mistakes, but it's clear that they happen often, and the only way to learn from them is to talk about them. This is why I strongly encourage my students to tell me about their mistakes and discuss them in our lab meeting.”
  13. 13. Fallibility in science: overview • Mistakes are everywhere • They are not career-destroying • Open science and collaboration can help avoid errors but they will still occur • Need to share code as well as data • Important to talk about mistakes • Correcting the record is painful and takes time, but is important for science and for scientists • If uncorrected, others may try to apply or build on erroneous work
  14. 14. Replication • How to respond when: • Someone else fails to replicate your result • You can’t replicate someone else’s study
  15. 15. Reasons for replication failure • Initial result was a false positive • Results are sensitive to contextual factors • Lack of expertise (‘flair’) of replicator • Initial results obtained using questionable research practices – p-hacking etc • Researcher used fraudulent practices If you agree to work with replicators, it demonstrates that you are genuinely interested in getting to the truth, and not fraudulent or sloppy
  16. 16. • 7 preregistered replication studies: none found predicted effects of power pose on behavioural or hormonal measures. • Dana Carney, who was first author on original power pose papers, advised on design. Has subsequently concluded power pose effect is not real. http://www.tandfonline.com/doi/full/10.1080/23743603.2017.1309876?src=recsys
  17. 17. Replication • How to respond when: • Your own study does not replicate • You can’t replicate someone else’s study http://deevybee.blogspot.co.uk/2014/08/replication-and-reputation-whose-career.html
  18. 18. Thought experiment #3 • Simon, a graduate student who works as a demonstrator, has been trying to replicate a well-known social priming effect in his undergraduate lab classes. Over three years, he has not been able to replicate the main finding. • He has submitted a paper based on all three studies to Psychological Science, who published the initial paper reporting the result, but it is rejected because of lack of novelty. • He writes a blogpost about his experiences, casting doubt on the original finding, but is then accused of being an incompetent researcher who is using social media to bully the authors of the original study
  19. 19. Questions • Was Simon’s response reasonable? • What else could he do? • What should he do now?
  20. 20. Appropriate response to finding problems in others’ work depends on two things Was it caused by: • Honest error • Questionable research practices • p-hacking • Suppressing inconvenient data • Outright fraud: data manipulation or invention Key question: Does it require: • Correction • Discussion • Retraction Should you go public with concerns, and if so, how & when?
  21. 21. Anne Weil …my first prominent publication was a note tearing down someone else’s work. That work had appeared in a major journal and caused quite a stir — but the apparent results were the product of a careless (not dishonest, just careless) mistake in the analysis. The note pointing this out was not derogatory in tone, nor was it intended to shame, but was doubtless embarrassing to the authors. Now that I am much older, a little wiser, and a little kinder (and a lot more employed, and thus less vulnerable to jerks) I would send the authors my analysis of their math first and give them the opportunity to correct. And I hope that my colleagues would give me the same consideration if (when?) I make a stupid mistake. Comment on : https://whatsinjohnsfreezer.com/2014/05/10/co-rex-ions/ Honest error
  22. 22. Appropriate response to finding problems in others’ work depends on nature of problem Concerns re research design/analysis/interpretation Usually due to ignorance rather than deliberate malpractice: • e.g. study does not have a crucial control group Key question: Does it require: • Correction • Discussion • Retraction Usually needs DISCUSSION, but how/where?
  23. 23. Concerns re research design etc PubMed Commons provides forum for post-publication peer review and provides a way of starting a discussion Commentators have to have published in a journal covered by PubMed and are not anonymous Comment should focus on the design flaw and its implications, not on the researchers PubMed Commons gives opportunity to email author to alert them to your comment and reply – though a personal message may be more effective
  24. 24. Spiro Pantazatos, 2016 Oct 19 01:18 a.m. Mind the distance: spatial proximity confounds tissue-tissue gene expression correlations reported in this study. This is a novel and very interesting study. However, the authors do not adequately control for spatial proximity, which, contrary to the authors’ claims in the original article, accounts entirely for high within-network strength fraction according to our recent replication/reanalysis of these same data. Furthermore, “null networks”, (i.e. contiguous clusters with center coordinates randomly placed throughout cortex), also have significantly high strength fractions, indicating that high within-network strength fraction is not related to resting-state networks identified by fMRI. Here is a link to the full technical commentary and replication/reanalysis write-up with additional supplementary discussion: http://biorxiv.org/content/early/2016/10/04/079202 And here is a link to the replication/reanalysis code on Github: https://github.com/spiropan/ABA_functional_networks The lead authors are aware of these findings and concerns (I notified them via personal email in March, 2016) and they have let me know they plan to respond. Note: Commentator is (a) highly specific; (b) provides links to reanalysis; (c) has raised concerns with authors
  25. 25. Appropriate response to finding problems in others’ work depends on nature of problem • Questionable research practices • p-hacking • Suppressing inconvenient data Key question: Does it require: • Correction • Discussion • Retraction Harder to detect; again PubMed Commons can be useful These have been so common in our discipline that they can be normative – often recommended by editors/reviewers! “Drop those results – they aren’t interesting”
  26. 26. https://www.ncbi.nlm.nih.gov/myncbi/franck.ramus.1/comments/ …….. Similarly, with 12 dyslexic individuals, only huge correlations greater than 0.576 could be significant. Luckily this study observed a correlation of 0.588 between left V5/MT-LGN connectivity and RAN (using a one-tailed test and correcting for two tests), but not with reading comprehension. But what about the other behavioural variables, spelling and reading speed? Are they not core symptoms of dyslexia, even more so than RAN? Do they not rely on visual abilities? Were the a priori predictions so specific to RAN and reading comprehension, that correlations with spelling and reading speed were not even tested? If those predictions had been preregistered, this might be credible. Alternatively, were those correlations tested, but not taken into account in the correction for multiple tests? (not even mentioning correlations within the control group, or across the two groups) Section from a comment by Franck Ramus. Note: draws attention to probable p-hacking but avoids personal attack on authors
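The 0.576 threshold quoted in the comment above can be checked by hand. The following is a minimal sketch, assuming a Pearson correlation tested with Student's t on n - 2 degrees of freedom, and assuming that "one-tailed, corrected for two tests" means a one-tailed alpha of 0.05 / 2. The function names `t_upper_tail` and `critical_r` are illustrative and do not come from the slides or the comment.

```python
import math


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom (t >= 0).

    Integrates the t density from 0 to t with Simpson's rule (steps must be even).
    """
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3


def critical_r(n, alpha_one_tailed):
    """Smallest correlation that reaches significance with n participants."""
    df = n - 2
    lo, hi = 0.0, 100.0
    # Bisection: the upper-tail probability decreases as t grows.
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha_one_tailed:
            lo = mid
        else:
            hi = mid
    t_crit = (lo + hi) / 2
    # Convert the critical t value into the equivalent correlation.
    return t_crit / math.sqrt(t_crit ** 2 + df)


if __name__ == "__main__":
    # Assumed reading of the comment: n = 12, one-tailed alpha 0.05 split over two tests.
    print(round(critical_r(12, 0.05 / 2), 3))
```

Under these assumptions the threshold comes out at about 0.576, so the reported correlation of 0.588 only just clears it. Any further uncorrected test would raise the bar, which is the point of the comment.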
  27. 27. More serious problems can be tackled via PubMed Commons See comment in PubMed Commons below Clin Sci (Lond). 2008 Feb;114(3):221-30. Normal-sodium diet compared with low-sodium diet in compensated congestive heart failure: is sodium an old enemy or a new friend? David Nunan, 2017 May 31 11:16 a.m. Readers may not be aware of concerns with duplicate data in this paper and another paper (Parrinello G, 2009) by the same group published in the Journal of Cardiac Failure in 2009. Both these papers were also included in a 2012 systematic review published in BMJ Open Heart which was subsequently retracted. A notice of concern was raised with the Journal of Cardiac Failure paper. No such notice has been made for this paper and neither individual papers have been retracted. https://www.ncbi.nlm.nih.gov/myncbi/david.nunan.1/comments/
  28. 28. Thought experiment #4 • A postdoctoral fellow, Susan, is conducting a meta-analysis of studies on autistic behaviours in mice with a particular genetic modification • She finds suspicious similarities between results in three papers by one research group, even though they are described as involving different animals • She emails the senior author to ask whether they were the same animals in the three studies but gets no reply
  29. 29. Questions • What should Susan do?
  30. 30. Response to suspicion of fraud • Check your facts and then check again • Look for a pattern: a single dodgy result is never enough • Discuss with author • Discuss with journal • Seek support from senior colleagues you trust • N.B. Direct confrontation: important, but not for the inexperienced or faint-hearted Good advice here: • Simonsohn (2013) Just post it: The lesson from two cases of fabricated data detected by statistics alone. Psychological Science, 24(10) 1875–1888
  31. 31. https://medium.com/@jamesheathers/the-buck-stops-nowhere-8284a57c88c9 “Criticism isn’t measured; in fact, it is not even considered ‘service’, a catch-all term for unpaid yet necessary sideline tasks to academic life. It is not considered at all. An additional perspective is also instructive. Imagine reading the following: • “responsible for three corrections and two retractions of terrible work which wasted hundreds of thousands of $ / thousands of work hours” • “hounded Journal XYZ into upholding their stated publication standards” • “author of at least thirty angry letters to editors, resulting in etc. etc.” Of course, it isn’t exactly easy to measure, but that is not the point here — the point is that the above is simply unthinkable for someone inside the academic tent. These sound like the career achievements of a curmudgeon, a thug or a crank. Even to me, these points, this reads as the brag sheet of a five-year-old boy who is proud of how many blocks he can kick over, wantonly destructive and oddly specific.” James Heathers, 2017 Tackling bad science takes up a lot of time and emotional energy – for little reward
  32. 32. Need to change the incentives! • Funders already alerted to this and working to reward reproducible science rather than sexy science • ‘Bullied into Bad Science’ campaign – formed by group of early-career researchers who were fed up with being pressured to publish in Science, Nature etc. – see @LoganCorina • Need more institutional change: hiring policies to value reproducible science
  33. 33. Overview: How to approach errors in others’ work • Computational error/failure to replicate ≠ bad science • Make contact with authors to express concerns at an early stage • If no response from senior author, can raise with other authors • Do not comment on social media unless and until direct approach to authors has failed • Take advice from senior colleagues you respect • Red flags: • Defensiveness and other-blaming (though these are natural human responses) • Unwillingness to share data (though widespread!) • Failure to respond when serious concerns are raised • Avoid inflammatory language, mockery, attacks on individuals. Objective statement of facts is more effective
  34. 34. Technical postscript: Statistical methods for detection of error or fraud • Brown & Heathers: GRIM (Granularity-Related Inconsistency of Means): mathematical methods for verifying summary statistics of published research reports in psychology. • Epskamp & Nuijten (2016) R package “statcheck”: Extract statistics from articles and recompute p values • Simonsohn (2013) Just post it: The lesson from two cases of fabricated data detected by statistics alone. Psychological Science, 24(10) 1875–1888 • Carlisle et al (2015) Calculating the probability of random sampling for continuous variables in submitted or published randomised controlled trials. Anaesthesia 2015, 70, 848–858
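The idea behind a GRIM-style check listed above can be illustrated briefly. When responses are whole numbers, the sum of n responses is an integer, so only certain means are possible for a given n. The following is a minimal sketch of that idea, not the authors' implementation. The function name `grim_consistent`, its parameters, and the example numbers are illustrative assumptions, and the half-unit tolerance is a simplification that accepts either rounding convention.

```python
import math


def grim_consistent(reported_mean, n, items=1):
    """Return True if a reported mean could arise from integer-valued data.

    reported_mean: the mean as printed, given as a string so that the number
                   of reported decimals is preserved (e.g. "5.19").
    n:             sample size.
    items:         number of integer items averaged per participant
                   (hypothetical parameter; 1 for a single-item measure).
    """
    decimals = len(reported_mean.split(".")[1]) if "." in reported_mean else 0
    target = float(reported_mean)
    cells = n * items
    approx_total = target * cells
    # Only the integer totals on either side of mean * cells can round to the target.
    for total in (math.floor(approx_total), math.ceil(approx_total)):
        # Half a unit in the last reported decimal, plus slack for float error.
        if abs(total / cells - target) <= 0.5 * 10 ** (-decimals) + 1e-9:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical example: 28 participants, single integer-scored item.
    print(grim_consistent("5.19", 28))  # no integer total gives 5.19
    print(grim_consistent("5.18", 28))  # 145 / 28 rounds to 5.18
```

A failed check is only a prompt to look more closely. Misreported sample sizes, excluded participants, or non-integer scoring can all produce an apparent inconsistency without any wrongdoing.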
  35. 35. In the end, being a good scientist isn’t easy, but we can try!
