Open experiments and open-source


A discussion of why we need to share experimental code in the behavioural sciences (psychology, neuroscience...), and why some people resist doing so

Published in: Technology
  • This talk was given by Jonathan Peirce as part of the Open Science Symposium at ECVP 2012, organised by Lee de Wit. It may be distributed freely under a Creative Commons licence (CC BY 3.0)
  • See also Stodden (2009) The Legal Framework for Reproducible Scientific Research. IEEE Computing in Science & Engineering. Reproducible conditions:
      - The full compendium is available on the Internet
      - The media components, including the original selection and arrangement of the data, are licensed under CC BY or released to the public domain under CC0
      - The code components are licensed under one of Apache 2.0, the MIT License, or the Modified BSD license, or released to the public domain under CC0
      - The data have been released into the public domain according to the Science Commons Open Data Protocol
  • Open experiments and open-source

    1. OPEN EXPERIMENTS AND OPEN-SOURCE SOFTWARE
       Jonathan Peirce, University of Nottingham
    2. OPEN-SOURCE SOFTWARE
       • Is free
       • Is often more feature-rich/advanced
       • Allows us to examine/change all the code
       • Buggy?
           - Young packages can be, but developers are usually very responsive to fixing bugs
           - Mature packages aren’t (e.g. see Firefox, Thunderbird, GIMP, Linux, Python…)
       • Unsustainable?
           - Not once they reach critical mass
       • Open-source software is good for science
       • What about open-sourcing experiments?
    3. A REPOSITORY OF OPEN EXPERIMENTS?
       • Goals:
           - Reproducibility: rather than running your interpretation of a study from its methods section, fetch the actual experiment
           - Publicity: draw attention to your experiment as people browse the repository
           - Education: a starting point to build an experiment for new users of a piece of software
       • One-stop location to up/download entire experiments or components of them
       • Platform/package independent (PsychoPy, PTB, Presentation…)
       • Easy to upload, easy to browse
    4. SIMILAR REPOSITORIES
       • MatlabCentral File Exchange (proprietary)
       • (for data)
    5. SIMILAR REPOSITORIES
       • MatlabCentral File Exchange (proprietary)
       • (for data)
       • Viperlib (for demos)
       • (computational economics)
           - Create a companion website for your paper
           - The model code can be run directly on the site(!)
    6. SIMILAR REPOSITORIES
       • MatlabCentral File Exchange (proprietary)
       • (for data)
       • Viperlib (for demos)
       • (computational economics)
       • (new)
           - "The Open Science Framework (OSF) is an infrastructure for documenting, archiving, logging, sharing, and registering scientific projects. Tools are being designed to integrate open practices with a scientist's daily workflow rather than appending them ex post facto."
           - See the OSF goals here:
    7. POTENTIAL CONCERNS
       • People don’t want others to see their code
       • People might run studies that they didn’t actually understand
       • Errors in studies might propagate more
       • Why should someone else benefit from the hours I spent coding that experiment/stimulus?
       • We don’t need this resource; we can make web pages and use code repositories (e.g. GitHub)
       • People will never use such a resource
       • Someone will have to set it up and run it
    8. PEOPLE DON’T WANT OTHERS TO SEE THEIR CODE
       • Why not?
           - Most people write code for themselves, not for others to see
           - Cleaning/documenting your code takes time
           - Maybe you’re a little worried about someone finding a bug in your code?
       • On the other hand
           - Writing neat, clear code is good; it means fewer bugs and more reusable code for yourself!
           - Although we don’t like people finding our bugs, it is actually a good thing for science
           - Some tools provide graphical interfaces, which should reduce the anxiety
    9. PEOPLE MIGHT RUN STUDIES THAT THEY DIDN’T UNDERSTAND
       • How?
           - They might not realise some critical part of the setup (e.g. a calibrated monitor)
           - They might make an inappropriate change or use settings that aren’t possible
       • On the other hand
           - Should we really be setting programming ability as a hurdle to running studies?
           - Providing the base code (and some notes including some of the caveats) will reduce this problem
           - Maybe the resource should point out that code does not replace the need for good supervision/education
    10. ERRORS IN STUDIES MIGHT PROPAGATE MORE
       • How?
           - If a study contains a bug in its code and is re-used by another lab, the bug will tend to remain. If they rewrote the code from scratch, it would be gone
       • On the other hand
           - In reality, if the later study finds a different result from the earlier one, it just fails to get published because we don’t know why the two studies differ. No advantage.
           - If there were a bug and the code were available, we would stand some chance of finding it
    11. WHY SHOULD SOMEONE ELSE BENEFIT?
       • You’ve put a lot of effort into building your study
       • Why should someone else just download it and use it for free?!
       • Let them think of their own study!
       • On the other hand:
           - (Thank goodness the open-source developers don’t think like that!)
           - You would get to benefit from other people’s work. Science benefits
           - You should want people to build on your studies. That is in your interest
    12. WE DON’T NEED THIS RESOURCE
       • Why not?
           - We could use code repositories (e.g. SourceForge, GitHub, etc.) or our institutional websites
       • But recall the goals:
           - Replicability
           - Publicity
           - Education
       • Open-source repositories are mostly designed for technically very literate users, which limits the pool of contributors
    13. PEOPLE WILL NEVER USE SUCH A RESOURCE
       • Really?
           - Lots of do-gooders have set up data repositories, but they’re empty
       • OK, so how would we get people to use an open-science repository?
           - Persuade people that it really is good for them if others can extend their study easily
           - Make it compulsory (e.g. via the journals)?
    14. PEOPLE WILL NEVER USE SUCH A RESOURCE
       • Really?
           - Lots of do-gooders have set up data repositories, but they’re empty
       • OK, so how would we get people to use an open-science repository?
           - Persuade people that it really is good for them if others can extend their study easily
           - Make it compulsory (e.g. via the journals)?
           - Provide a kite-mark, via the journals, for articles that can be fully replicated [since giving the talk I have discovered that kite-mark is a purely British concept. It refers to a non-compulsory badge, from the British Standards Institution, showing that a product meets high quality standards]
    15. REPRODUCIBLE RESEARCH STANDARD
       • Stodden (2009) Enabling reproducible research: licensing scientific innovation. International Journal of Communications Law and Policy
       • Potentially different levels of compliance with the standard:
           - Verified: has already been verified in an independent lab
           - Verifiable: the compendium (full set of research materials) is available to fully reproduce the study
           - Semi-verifiable: not all materials have been released but the description of the work should allow replication
           - Non-verifiable: the work requires materials or apparatus that are not typically available
       • “Efforts are currently under way for the RRS to be an official mark of Science Commons. This would provide an easily identifiable logo and a clear definition for each level of reproducibility.”
    16. IT WILL TAKE TIME AND EFFORT TO IMPLEMENT
       • There will be some development time needed to build a site
       • There might be further time needed to manage/screen the contributions (I’m too busy with PsychoPy)
       • On the other hand:
           - There are open-source tools already available to build academic repositories
           - We might be able to piggy-back on another site
           - Maybe the Open Science Framework will do all we want
    17. SUMMARY
       • Open-source software has improved scientific
           - Productivity
       • Open-source experiments could improve scientific
           - Reproducibility
           - Education
           - Productivity
       • But we need:
           - buy-in from the scientists (and possibly the journals)
           - user-friendly resources
    18. OPINIONS?
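
The four compliance levels of the Reproducible Research Standard listed on slide 15 could be represented as a small data structure if a repository wanted to assign a badge to each study. The following is only a sketch and is not taken from the talk or from Stodden's paper: the names `Level` and `classify`, and the three yes/no inputs, are hypothetical, and the fall-through to "non-verifiable" is a simplifying assumption.

```python
from enum import Enum


class Level(Enum):
    """Compliance levels as named on slide 15 (identifiers are illustrative)."""
    VERIFIED = "verified"
    VERIFIABLE = "verifiable"
    SEMI_VERIFIABLE = "semi-verifiable"
    NON_VERIFIABLE = "non-verifiable"


def classify(independently_reproduced: bool,
             full_compendium_released: bool,
             description_allows_replication: bool) -> Level:
    """Map three yes/no facts about a study onto a level.

    Hypothetical rule: the strongest condition that holds wins.
    Anything that meets none of the conditions is treated as
    non-verifiable, which simplifies the slide's definition
    (materials or apparatus not typically available).
    """
    if independently_reproduced:
        return Level.VERIFIED
    if full_compendium_released:
        return Level.VERIFIABLE
    if description_allows_replication:
        return Level.SEMI_VERIFIABLE
    return Level.NON_VERIFIABLE


if __name__ == "__main__":
    # A study whose full materials are online but not yet rerun elsewhere.
    print(classify(False, True, True).value)
```

Ordering the checks from strongest to weakest keeps the rule easy to audit, which matters if a journal were to display the result next to an article.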