Open experiments and open-source
1. OPEN EXPERIMENTS AND OPEN SOURCE SOFTWARE
Jonathan Peirce
University of Nottingham
2. OPEN-SOURCE SOFTWARE
Is free
Is often more feature-rich/advanced
Allows us to examine/change all the code
Buggy?
Young packages can be, but developers are usually very responsive to fixing bugs
Mature packages aren’t (e.g. see Firefox, Thunderbird, GIMP, Linux, Python…)
Unsustainable?
Not once they reach critical mass
Open-source software is good for science
What about open-sourcing experiments?
3. A REPOSITORY OF OPEN EXPERIMENTS?
Goals:
Reproducibility: rather than running your interpretation of a study from its methods section, fetch the actual experiment
Publicity: draw attention to your experiment as people browse the repository
Education: a starting point for new users of a piece of software to build an experiment
One-stop location to upload/download entire experiments or components of them
Platform/package independent (PsychoPy, PTB, Presentation…)
Easy to upload, easy to browse
5. SIMILAR REPOSITORIES
MatlabCentral File Exchange (proprietary)
Figshare.com (for data)
Viperlib (for demos)
RunMyCode.org (computational economics)
Create a 'Companion website' for your paper
The model code can be run directly on the site(!)
6. SIMILAR REPOSITORIES
MatlabCentral File Exchange (proprietary)
Figshare.com (for data)
Viperlib (for demos)
RunMyCode.org (computational economics)
OpenScienceFramework.org (new)
"The Open Science Framework (OSF) is an infrastructure for documenting, archiving, logging, sharing, and registering scientific projects. Tools are being designed to integrate open practices with a scientist's daily workflow rather than appending them ex post facto."
See the OSF goals here: http://openscienceframework.org/project/4znZP/wiki/home
7. POTENTIAL CONCERNS
People don’t want others to see their code
People might run studies that they didn't actually understand
Errors in studies might propagate more
Why should someone else benefit from the hours I spent coding that experiment/stimulus?
We don’t need this resource; we can make web pages and use code repositories (e.g. GitHub)
People will never use such a resource
Someone will have to set it up and run it
8. PEOPLE DON’T WANT OTHERS TO SEE THEIR CODE
Why not?
Most people write code for themselves, not for others to see
Cleaning/documenting your code takes time
Maybe you’re a little worried about someone finding a bug in your code?
On the other hand:
Writing neat, clear code is good; it means fewer bugs and more reusable code for yourself!
Although we don’t like people finding our bugs, it is actually a good thing for science
Some tools provide graphical interfaces, which should reduce the anxiety
9. PEOPLE MIGHT RUN STUDIES THAT THEY DIDN'T UNDERSTAND
How?
They might not realise some critical part of the setup (e.g. a calibrated monitor)
They might make an inappropriate change or use settings that aren't possible
On the other hand:
Should we really be setting programming ability as a hurdle to running studies?
Providing the base code (and some notes, including some of the caveats) will reduce this problem
Maybe the resource should point out that code does not replace the need for good supervision/education
10. ERRORS IN STUDIES MIGHT PROPAGATE MORE
How?
If a study contains a bug in its code and is re-used by another lab, the bug will tend to remain. If they re-wrote the code from scratch it would be gone
On the other hand:
In reality, if the latter study finds a different result to the former, it just fails to get published because we don't know why the two studies differ. No advantage.
If there were a bug and the code were available, we would stand some chance of finding it
11. WHY SHOULD SOMEONE ELSE BENEFIT?
You've put a lot of effort into building your study
Why should someone else just download it and use it for free?! Let them think of their own study!
On the other hand:
(Thank goodness the open-source developers don't think like that!)
You would get to benefit from other people's work
Science benefits
You should want people to build on your studies; that is in your interest
12. WE DON’T NEED THIS RESOURCE
Why not?
We could use code repositories (e.g. SourceForge, GitHub) or our institutional websites
But recall the goals:
Replicability
Publicity
Education
Open-source repositories are mostly designed for the technically very literate, which limits the pool of contributors
13. PEOPLE WILL NEVER USE SUCH A RESOURCE
Really?
Lots of 'do-gooders' have set up data repositories, but they're empty
OK, so how would we get people to use an open-science repository?
Encourage people that it really is good for them if people can extend their study easily
Make it compulsory (e.g. via the journals)?
14. PEOPLE WILL NEVER USE SUCH A RESOURCE
Really?
Lots of 'do-gooders' have set up data repositories, but they're empty
OK, so how would we get people to use an open-science repository?
Encourage people that it really is good for them if people can extend their study easily
Make it compulsory (e.g. via the journals)?
Provide a kite-mark, via the journals, for articles that can be fully replicated
[Since giving the talk I have discovered that 'kite-mark' is a purely British concept. It refers to a non-compulsory badge, from the British Standards Institute, showing that a product meets high quality standards]
15. REPRODUCIBLE RESEARCH STANDARD
Stodden (2009) Enabling reproducible research: licensing scientific innovation. International Journal of Communications Law and Policy
Potentially different levels of compliance with the standard:
Verified: has already been verified in an independent lab
Verifiable: the compendium (full set of research materials) is available to fully reproduce the study
Semi-verifiable: not all materials have been released but the description of the work should allow replication
Non-verifiable: the work requires materials or apparatus that are not typically available
“Efforts are currently under way for the RRS to be an official mark of Science Commons. This would provide an easily identifiable logo and a clear definition for each level of reproducibility.”
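The four compliance levels form an ordered scale, which can be sketched as a small data structure. This is a hypothetical illustration of the ordering, not part of Stodden's proposal, and the `rrs_level` helper is an invented name; in practice the level would be judged by people, not code:

```python
from enum import IntEnum

class RRSLevel(IntEnum):
    """Compliance levels from the Reproducible Research Standard,
    ordered from weakest to strongest guarantee."""
    NON_VERIFIABLE = 0   # requires materials/apparatus not typically available
    SEMI_VERIFIABLE = 1  # materials incomplete, but description should allow replication
    VERIFIABLE = 2       # full compendium available to reproduce the study
    VERIFIED = 3         # already reproduced in an independent lab

def rrs_level(compendium_available, replicated, description_sufficient):
    """Classify a study by the strongest level its evidence supports."""
    if replicated:
        return RRSLevel.VERIFIED
    if compendium_available:
        return RRSLevel.VERIFIABLE
    if description_sufficient:
        return RRSLevel.SEMI_VERIFIABLE
    return RRSLevel.NON_VERIFIABLE
```

Because `IntEnum` members compare numerically, a repository could, for example, filter for studies at `RRSLevel.VERIFIABLE` or above.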
16. IT WILL TAKE TIME AND EFFORT TO IMPLEMENT
There will be some development time to build a site
There might be further time needed to manage/screen the contributions
(I'm too busy with PsychoPy)
On the other hand:
There are open-source tools already available to build academic repositories
We might be able to piggy-back on another site
Maybe the Open Science Framework will do all we want
17. SUMMARY
Open-source software has improved scientific:
Productivity
Open-source experiments could improve scientific:
Reproducibility
Education
Productivity
But we need:
buy-in from the scientists (and possibly the journals)
user-friendly resources
This talk was given by Jonathan Peirce as part of the Open Science Symposium at ECVP 2012, organised by Lee de Wit. It may be distributed freely under Creative Commons (CC BY 3.0): http://creativecommons.org/licenses/by/3.0/
See also Stodden (2009) The Legal Framework for Reproducible Scientific Research. IEEE Computing in Science & Engineering.
Reproducible conditions:
The full compendium is available on the Internet
The media components, including the original selection and arrangement of the data, are licensed under CC BY or released to the public domain under CC0
The code components are licensed under one of Apache 2.0, the MIT License, or the Modified BSD license, or released to the public domain under CC0
The data have been released into the public domain according to the Science Commons Open Data Protocol
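Those per-component licence conditions could in principle be checked mechanically when a compendium is uploaded. A minimal sketch, where the licence names come from the conditions above but the checker itself (`ALLOWED_LICENCES`, `meets_rrs_conditions`) is entirely hypothetical:

```python
# Licences accepted for each component type under the "reproducible"
# conditions listed above (hypothetical checker, not an official tool)
ALLOWED_LICENCES = {
    "media": {"CC BY", "CC0"},
    "code": {"Apache 2.0", "MIT", "Modified BSD", "CC0"},
    "data": {"Science Commons Open Data Protocol"},
}

def meets_rrs_conditions(components):
    """Return True if every (kind, licence) pair in the compendium
    is on the allowed list for that component type."""
    return all(
        licence in ALLOWED_LICENCES.get(kind, set())
        for kind, licence in components.items()
    )
```

A repository could run such a check at upload time and award the "fully replicable" kite-mark discussed on slide 14 only when it passes.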