sicsa-phd2016

Talk for Reproducible Research Workshop by Ian Gent, University of St Andrews, Monday June 27 2016

Published in: Education

  1. 1. Reproducible Research Workshop SICSA PhD Conference June 27, 2016 Ian Gent University of St Andrews http://ian.gent http://www.slideshare.net/turingfan/sicsaphd2016 Image: © Columbia Pictures
  2. 2. Joint Work with Graham McDonald • http://www.gla.ac.uk/schools/computing/researchstudents/grahammcdonald/ • Graham has put far more work into this session than me!
  3. 3. Two things from the title slide I want to highlight
  4. 4. Yes this really is my URL: http://ian.gent
  5. 5. And it’s all about doing the same thing over and over again (Image: © Columbia Pictures)
  6. 6. Outline of Talk • Part I: Reproducibility in (Computer) Science • Part II: What we are doing for the rest of the session
  7. 7. Reproducibility in Science • Officially, reproducibility is key to science • If you reproduce my experiment it’s win-win-win • You win because you have done a good thing for Science • I win because my experiment is validated • Science wins because it knows that my conclusions are valid • Computer Science is at a huge advantage • Many of our experiments are software runs • With no human intervention or automatable intervention • We can then rerun them very easily anytime we wish • Even if our experiments need major human intervention • We can automate the parts that don’t • Plus data analysis etc
  8. 8. Reproducibility in Science • Officially, reproducibility is key to science • If you reproduce my experiment it’s win-win-win • You win because you have done a good thing for Science • I win because my experiment is validated • Science wins because it knows that my conclusions are valid • Computer Science is at a huge advantage • Unofficially?
  9. 9. #overlyhonestmethods https://twitter.com/ianholmes/status/288689712636493824
  10. 10. #overlyhonestmethods (still going strong!) https://twitter.com/DRG_physics/status/745927096850087936
  11. 11. xkcd/PhD/Dilbert Compliance https://www.phdcomics.com/comics/archive.php?comicid=1569
  12. 12. Galileo’s Telescopes • Imagine if we could look through Galileo’s telescopes • And we hadn’t bothered to keep them • Or threw away the only postdoc … • This has happened in computer science • Many many times
  13. 13. Galileo’s Telescopes • SHRDLU is a famous early AI program • We have the source code • But we can’t run it! Image: AI Lab, MIT
  14. 14. Interlude: Can you do experiments in CS?
  15. 15. Real Experiments that helped me get my first AI Journal paper
  16. 16. Yes it was 22 years ago!
  17. 17. What has this got to do with reproducibility?
  18. 18. How I almost didn’t get an AI Journal paper • I ran experiments overnight • Since “all” instances were easy they should just take a few hours • When I came in the next morning the output was stuck • No file writes for several hours • I assumed that some file output had got lost somehow • Killed the job and deleted all the relevant files • And the files had random seeds in them • If I had just not deleted them I could have reproduced the experiments • If this effect was a one in a million event I might never have seen it again • I am extremely lucky that it occurred once or twice in a thousand events
  19. 19. This is a fun story but with a bigger message • Reproducibility is not about the worthy principle of science • Reproducibility is about being able to reproduce your own work • Or else you can lose journal papers! • And make the next paper much harder to write • And it’s about building on the great work of others • The easier it is to reproduce somebody else’s work • The easier it is to improve on their work and contribute more to science • “It's not really for the benefit of other people. Experience shows the principal beneficiary of reproducible research is you the author yourself.” Jon Claerbout
  20. 20. Part II: The rest of this session • 11.55 Group Discussions • 12.30 Report back session • 12.55 Interactive Polling • 13.05 End (but something will slip so it will be 13.10)
  21. 21. Groups and papers • Graham has pre-allocated you to groups • Mailed you about your allocation on Friday • Graham allocated a paper to each group • Papers are (very broadly) related to your research field • You should now be sitting at the table allocated to your group • There are printed copies of the papers at your table • If you are unsure of your group raise your hand NOW • Somebody from the committee will help you
  22. 22. The Papers • Algorithms • Learning Expressive Linkage Rules using Genetic Programming • Robert Isele, Christian Bizer • Computer/Complex Systems • Experimental demonstration of associative memory with memristive neural networks • YV Pershin, M Di Ventra • Human Computer Interaction [2 groups] • I did that! Measuring users' experience of agency in their own actions • D Coyle, J Moore, PO Kristensson, P Fletcher, A Blackwell • Machine Learning [2 groups] • A Model for Learning the Semantics of Pictures • V. Lavrenko, R. Manmatha, J. Jeon • Networks • Modeling and performance analysis of BitTorrent-like peer-to-peer networks • D Qiu, R Srikant
  23. 23. Some Issues to Think about • We have flagged up 5 issues for you to think about • With a number of suggested questions in each one • Some of these issues may not be relevant to your paper • So don’t feel you have to discuss each one to the same depth • Please use some of your time to discuss Issue 5: General Issues • And you may have more issues • Which would be great • And you may have more questions within each issue • Which would also be great
  24. 24. 1. Accuracy of Reproduction • How accurately would you expect to be able to reproduce the results of the paper? • If you were to try to reproduce this paper, in which parts of the experiment would you expect variations or differences between the original experiment and your reproduction?
  25. 25. 2. Failure of Reproduction • If your reproduction had more variations than you expected, what variations in experimental design or results would lead you to conclude that the reproduction failed? • What would make you decide that the experiments are flawed enough to invalidate the conclusions?
  26. 26. 3. Difficulty and Costs • What aspects of the reported experiment would you expect to be most challenging to reproduce? • What specialist or specific resources do you need access to for reproducing the experiment? • Are these resources publicly available? • What costs are attached to doing the reproduction? • Would this be a good use of this money/time?
  27. 27. 4. Legal and Ethical • Are there legal or ethical issues around reproducing this experiment? • If so, what are they? • If not, why aren’t there any?
  28. 28. 5. General Issues • What are the main difficulties and costs, or legal and ethical issues relating to reproducibility in your field of computing science? • What practical steps are generally taken in your field of computing science to achieve reproducibility? • Why should we be striving for reproducibility in computing science? • How much should paper reviewers focus on reproducibility? • What about PhD examiners?
  29. 29. I’ve talked too long. Go! • 11.55 Group Discussions • 12.30 Report back session • 12.55 Interactive Polling • 13.05 End (but something will slip so it will be 13.10)
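
The story on slide 18 turns on one detail: the random seeds lived only in output files that were deleted. A minimal sketch of one way to guard against that is given below, assuming a hypothetical experiment function `run_instance` that is fully determined by its seed. All names here (`run_instance`, `run_batch`, `unfinished_seeds`, the JSON-lines log format) are illustrative assumptions and not the setup used in the talk. The idea is simply to write each seed to an append-only log, and flush it, before the run starts, so that a run that hangs and gets killed can still be identified and replayed.

```python
import json
import random
from pathlib import Path
from typing import List


def run_instance(seed: int) -> float:
    """Hypothetical stand-in for one experiment; its result depends only on the seed."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(1000))


def run_batch(log_path: Path, n_instances: int, master_seed: int) -> None:
    """Run n_instances experiments, logging each seed BEFORE its run starts."""
    master = random.Random(master_seed)
    with log_path.open("a") as log:
        for i in range(n_instances):
            seed = master.getrandbits(32)
            # Record the seed first and flush, so it survives a killed job.
            log.write(json.dumps({"instance": i, "seed": seed, "status": "started"}) + "\n")
            log.flush()
            result = run_instance(seed)
            log.write(json.dumps({"instance": i, "seed": seed, "status": "done",
                                  "result": result}) + "\n")
            log.flush()


def unfinished_seeds(log_path: Path) -> List[int]:
    """Return seeds of runs that were started but never logged as done."""
    started = {}
    done = set()
    for line in log_path.read_text().splitlines():
        record = json.loads(line)
        if record["status"] == "started":
            started[record["instance"]] = record["seed"]
        else:
            done.add(record["instance"])
    return [seed for instance, seed in started.items() if instance not in done]
```

After an overnight job appears stuck, calling `unfinished_seeds` on the log would list the seeds worth rerunning with `run_instance`, rather than leaving the rare behaviour to chance. Keeping the log separate from the result files also means that cleaning up outputs does not destroy the seeds.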
