Measuring Reproducibility in
Computer Systems Research
Emir Muñoz
National University of Ireland Galway
Christian Collberg, Todd Proebsting, Gina Moraila,
Akash Shankaran, Zuoming Shi, Alex M Warren
http://reproducibility.cs.arizona.edu/
2
Reproducibility is the ability of an entire experiment
or study to be reproduced, either by the researcher or
by someone else working independently.
DEFINITION
One of the main principles of the scientific method.
3
“Unwillingness or inability to share one’s work with
fellow researchers hampers the progress of science and
leads to needless replication of work
and the publication of potentially flawed results.”
• Cliché phrases?
• 613 papers with practical orientation from:
– 8 ACM Conferences:
• ASPLOS’12, CCS’12, OOPSLA’12, OSDI’12, PLDI’12,
SIGMOD’12, SOSP’11, VLDB’12
– 5 Journals
• TACO’9, TISSEC’15, TOCS’30, TODS’37, TOPLAS’34
4
EXPERIMENT
“Our approach can be applied on ...”
“Our implementation can be found at ...”
“... we implemented our approach”
“code and data can be downloaded from our website”
5
Can a CS student build the software within 30 minutes,
including finding and installing any dependent software
and libraries, and without bothering the authors?
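The 30-minute criterion above can be approximated mechanically. The following is a minimal sketch, not the procedure used in the study: the function name `timed_build` and the idea of wrapping a single build command are assumptions made for illustration, and the human steps (finding and installing dependencies) are not modelled.

```python
import subprocess
import sys
import time

BUILD_BUDGET_SECONDS = 30 * 60  # the 30-minute limit from the question above


def timed_build(command, budget=BUILD_BUDGET_SECONDS):
    """Run a build command and return (succeeded, elapsed_seconds).

    A build that exceeds the budget or exits with a non-zero status
    counts as a failure. `timed_build` is a hypothetical helper.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(command, timeout=budget, capture_output=True)
        succeeded = result.returncode == 0
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        succeeded = False
    return succeeded, time.monotonic() - start


# Stand-in for a real build command such as ["make"]; purely illustrative.
ok, elapsed = timed_build([sys.executable, "-c", "print('build ok')"])
print(ok, round(elapsed, 1))
```

In practice the command would be whatever the paper's instructions specify, and the budget would also have to cover the time spent reading documentation and installing libraries.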
Image source: http://jazzadvice.com/
• [Vandewalle et al. 2009] distinguish six degrees of reproducibility:
– 5: The results can be easily reproduced by an independent
researcher with at most 15 min of user effort, requiring only
standard, freely available tools (C compiler, etc.).
– 4: The results can be easily reproduced by an independent
researcher with at most 15 min of user effort, requiring some
proprietary source packages (MATLAB, etc.).
– 3: The results can be reproduced by an independent researcher,
requiring considerable effort.
– 2: The results could be reproduced by an independent researcher,
requiring extreme effort.
– 1: The results cannot seem to be reproduced by an
independent researcher.
– 0: The results cannot be reproduced by an independent
researcher.
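The six degrees listed above can be written down as a small lookup. This is only a sketch of one possible encoding: the names `Outcome`, `Effort` and `degree` are hypothetical, and the scale itself gives no numeric threshold separating "considerable" from "extreme" effort, so effort is passed in as a label rather than computed.

```python
from enum import Enum


class Outcome(Enum):
    REPRODUCIBLE = "can or could be reproduced"
    SEEMS_NOT = "cannot seem to be reproduced"
    CANNOT = "cannot be reproduced"


class Effort(Enum):
    EASY = "at most 15 min of user effort"
    CONSIDERABLE = "considerable effort"
    EXTREME = "extreme effort"


def degree(outcome, effort=None, needs_proprietary=False):
    """Map an assessment to the 0-5 scale listed above."""
    if outcome is Outcome.CANNOT:
        return 0
    if outcome is Outcome.SEEMS_NOT:
        return 1
    if effort is Effort.EXTREME:
        return 2
    if effort is Effort.CONSIDERABLE:
        return 3
    if effort is Effort.EASY:
        # Degrees 4 and 5 differ only in whether proprietary packages are needed.
        return 4 if needs_proprietary else 5
    raise ValueError("a reproducible result needs an effort level")


print(degree(Outcome.REPRODUCIBLE, Effort.EASY))
print(degree(Outcome.REPRODUCIBLE, Effort.EASY, needs_proprietary=True))
```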
6
PREVIOUS EXERCISES
• [Stodden 2010] reports on 638 registrants at the NIPS
machine learning conference.
– Why don’t we share the code?
7
PREVIOUS EXERCISES
“The time it takes to clean up and document for release”
“Dealing with questions from users about the code”
“The possibility that your code may be used without citation”
“The possibility of patents, or other IP constraints”
“Competitors may get an advantage”
8
METHODOLOGY
No attempt to check the
consistency of the claims
made in the original paper.
9
METHODOLOGY
10
METHODOLOGY
Excluded
Non-reproducible
No contact
11
RESULTS
[Bar chart “Reproducibility”: bars at 9.8%, 17.4%, 26.0%, 34.4%, 44.4%; axis from 0% to 50%]
12
RESULTS
• The National Science Foundation’s (NSF) Grant
Policy Manual states that:
– Investigators are expected to share with other
researchers...
– Investigators and grantees are encouraged to share
software and inventions...
– ... Responsibility that investigators and organizations
have as members of the scientific and engineering
community, to make results, data and collections
available to other researchers.
• Industry
– Papers with only authors from industry have a low
rate of reproducibility
13
RESULTS
14
Image source: www.funnyjunk.com
• Versioning Problems
• Code Will be Available Soon
• Programmer Left
• Bad Backup Practices
• Commercial Code
• Proprietary Academic Code
• Unavailable Subsystems
• Multiple Reasons
• Intellectual Property
• Research vs. Sharing
• Security and Privacy
• Poor Design
• Too Busy to Help
So, What Were Their Excuses?
15
RESULTS
“Attached is the (system) source code of our algorithm. I’m not very sure whether it is
the final version of the code used in our paper, but it should be at least 99% close.”
“Thank you for your interest in our work. Unfortunately the current system is not mature
enough at the moment, so it’s not yet publicly available...”
“I am afraid that the source code was never released. The code was never intended to be
released so is not in any shape for general use.”
“(STUDENT) was a graduate student in our program but he left a while back so I am
responding instead...”
“Thanks ... Unfortunately, the server in which my implementation was stored had a disk
crash in April and three disks crashed simultaneously...”
“The code is owned by (COMPANY), ...is not open-source...Your best bet is to reimplement
:( Sorry”
“...sources are not meant to be opensource..I do not have the liberty of making available
The source code at my current institution (UNIVERSITY)...”
16
Most importantly, I do not have the bandwidth to help anyone
come up to speed on this stuff.
RESULTS
17
18
RESEARCH ~ COLLABORATION
• Conferences to require the code along with
every paper submitted
• Build special tools that can run reliably and
with reproducible results
• Build web sites that allow authors to make
their code available to colleagues
• Do not follow bad habits like the “publish
and forget” style of scientific research
19
RECOMMENDATIONS
20
RECOMMENDATIONS
Grammar for sharing specifications
1. Unless you have compelling reasons not to, plan to
release the code.
2. Students will leave; plan for it.
3. Create permanent email addresses.
4. Create project websites.
5. Use a source code control system.
6. Back up your code.
7. Resolve licensing issues.
8. Keep your promises.
9. Plan for longevity.
10. Avoid cool but unusual design.
11. Plan for Reproducible Releases.
21
LESSONS LEARNED
22
23
Bash code!!
Run button

Output
Visualization
24
Reproducible Research in Computational Science
Roger D. Peng
http://www.sciencemag.org/content/334/6060/1226.full
25
Rule 1:
For Every Result,
Keep Track of
How It Was Produced
Ten Simple Rules for Reproducible Computational Research
http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003285
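One way to follow Rule 1 is to store a small provenance record next to each result. The sketch below assumes a result is fully described by the command that produced it and the bytes of its input; `provenance_record`, the result name `figure2` and the script name `analyze.py` are hypothetical placeholders, not taken from the article.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(result_name, command, input_bytes):
    """Describe how a result was produced, as a JSON-serializable dict."""
    return {
        "result": result_name,
        "command": command,
        # A hash pins down the exact input without storing it twice.
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }


record = provenance_record(
    "figure2",                                # hypothetical result name
    ["python", "analyze.py", "--k", "5"],     # hypothetical command line
    b"a,b\n1,2\n",                            # stand-in for the input file contents
)
print(json.dumps(record, indent=2))
```

Writing this record to a file beside the result means the question "how was this produced?" can be answered later without relying on memory.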
26
Rule 2:
Avoid Manual Data
Manipulation Steps
27
Rule 3:
Archive the Exact
Versions of All External
Programs Used
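As a first step towards Rule 3, the exact versions in use can at least be recorded automatically. The sketch below covers only the Python interpreter and installed Python packages, which is an assumption about the toolchain; it records versions rather than archiving the programs themselves, and `environment_snapshot` is a hypothetical helper name.

```python
import json
import sys
from importlib import metadata


def environment_snapshot():
    """Return the interpreter version and the version of every installed distribution."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "packages": dict(sorted(packages.items())),
    }


snapshot = environment_snapshot()
print(snapshot["python"], len(snapshot["packages"]), "packages recorded")
```

Saving the snapshot as JSON alongside the results gives a later reader the version list needed to rebuild a matching environment.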
28
Rule 4:
Version Control
All Custom Scripts
29
Rule 5:
Record All Intermediate
Results, When Possible
In Standardized Formats
30
Rule 6:
For Analyses That
Include Randomness,
Note Underlying
Random Seeds
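Rule 6 in miniature: pass the seed in explicitly and store it with the result. The sketch below uses a toy bootstrap of a mean as the randomized analysis; the function name, the data and the seed value are all illustrative assumptions.

```python
import random


def bootstrap_mean(values, n_resamples, seed):
    """Average of resampled means, driven by an explicit, recorded seed."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(values) for _ in values]
        means.append(sum(sample) / len(sample))
    return sum(means) / len(means)


SEED = 20140101  # arbitrary value chosen for this sketch; store it with the result
data = [1.0, 2.0, 3.0, 4.0]
first = bootstrap_mean(data, n_resamples=100, seed=SEED)
second = bootstrap_mean(data, n_resamples=100, seed=SEED)
print(first == second)  # True: the same seed reproduces the same number
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the analysis reproducible even if other code also draws random numbers.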
31
Rule 7:
Always Store Raw
Data behind Plots
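For Rule 7, a simple habit is to write the plotted numbers to a plain file whenever a figure is generated. The sketch below assumes CSV as the storage format and uses hypothetical helper names and a hypothetical file name; the plotting call itself is omitted.

```python
import csv
import tempfile
from pathlib import Path


def save_plot_data(path, header, rows):
    """Write the exact numbers behind a plot to a CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def load_plot_data(path):
    """Read the stored rows back; csv returns every field as a string."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle))


out_dir = Path(tempfile.mkdtemp())           # stand-in for the figure directory
data_file = out_dir / "figure1_data.csv"     # hypothetical file name
save_plot_data(data_file, ["x", "y"], [(1, 2.5), (2, 3.5)])
print(load_plot_data(data_file))
```

With the data stored this way, a figure can be redrawn or restyled later without rerunning the whole analysis.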
32
Rule 8:
Generate Hierarchical
Analysis Output,
Allowing Layers
of Increasing Detail
to Be Inspected
33
Rule 9:
Connect Textual
Statements to
Underlying Results
34
Rule 10:
Provide Public Access
to Scripts, Runs,
and Results
35
• As a discipline, we are a long way from
producing research that is always, and
completely, reproducible.
• Sharing may increase the probability of
citation.
• The sharing specifications will have a positive
effect on researchers’ willingness to share.
• Sharing specifications can be used as a
contract between authors and readers.
36
CONCLUSION
• Data Quality and Trustworthiness
– How close is this data to the real world?
– Can I trust this data?
37
HOW THIS IS RELATED TO MY PHD
Data is The New (Black) Gold
• Data Replication & Reproducibility
– http://www.sciencemag.org/site/special/data-rep/
• Getting Results from Testing by Laura Dillon
(ACM Distinguished Speakers Program)
– http://dsp.acm.org/view_lecture.cfm?lecture_id=108
• Why You Should Share Your Musical Knowledge
– http://jazzadvice.com/why-you-should-share-your-musical-knowledge/
• Reproducible Research in Signal Processing
– http://rr.epfl.ch/17/1/VandewalleKV09.pdf
38
FURTHER LITERATURE
• RunMyCode enables scientists to openly
share the code and data that underlie their
research publications
– http://www.runmycode.org/
• Executable Papers
– http://executablepapers.com/
• CDE: Automatically create portable Linux
applications (i.e., package, deliver, run).
– http://www.pgbovine.net/cde.html
39
FURTHER LITERATURE
• VLDB Guidelines
– http://www.vldb.org/2013/experimental_reproducibility.html
• Data Package Management
– http://dat-data.com/
– https://github.com/maxogden/dat
• Data Dryad
– http://datadryad.org/
40
FURTHER LITERATURE

Reading Group 2014

  • 1. Measuring Reproducibility in Computer Systems Research. Emir Muñoz, National University of Ireland Galway. Christian Collberg, Todd Proebsting, Gina Moraila, Akash Shankaran, Zuoming Shi, Alex M Warren. http://reproducibility.cs.arizona.edu/
  • 2. DEFINITION: Reproducibility is the ability of an entire experiment or study to be reproduced, either by the researcher or by someone else working independently. It is one of the main principles of the scientific method.
  • 3. “Unwillingness or inability to share one’s work with fellow researchers hampers the progress of science and leads to needless replication of work and the publication of potentially flawed results.”
  • 4. EXPERIMENT: • 613 papers with practical orientation from: – 8 ACM Conferences: ASPLOS’12, CCS’12, OOPSLA’12, OSDI’12, PLDI’12, SIGMOD’12, SOSP’11, VLDB’12 – 5 Journals: TACO’9, TISSEC’15, TOCS’30, TODS’37, TOPLAS’34 • Cliché phrases? “Our approach can be applied on ...” “Our implementation can be found at ...” “... we implemented our approach” “code and data can be downloaded from our website”
  • 5. Can a CS student build the software within 30 minutes, including finding and installing any dependent software and libraries, and without bothering the authors? Image source: http://jazzadvice.com/
  • 6. PREVIOUS EXERCISES: • [Vandewalle et al. 2009] distinguish six degrees of reproducibility: – 5: The results can be easily reproduced by an independent researcher with at most 15 min of user effort, requiring only standard, freely available tools (C compiler, etc.). – 4: The results can be easily reproduced by an independent researcher with at most 15 min of user effort, requiring some proprietary source packages (MATLAB, etc.). – 3: The results can be reproduced by an independent researcher, requiring considerable effort. – 2: The results could be reproduced by an independent researcher, requiring extreme effort. – 1: The results cannot seem to be reproduced by an independent researcher. – 0: The results cannot be reproduced by an independent researcher.
  • 7. PREVIOUS EXERCISES: • [Stodden 2010] reports on a survey of 638 registrants at the NIPS machine learning conference. – Why don’t we share the code? “The time it takes to clean up and document for release” “Dealing with questions from users about the code” “The possibility that your code may be used without citation” “The possibility of patents, or other IP constraints” “Competitors may get an advantage”
  • 8. METHODOLOGY: No attempt was made to check the consistency of the claims made in the original paper.
  • 13. RESULTS: • The National Science Foundation’s (NSF) Grant Policy Manual states that: – Investigators are expected to share with other researchers... – Investigators and grantees are encouraged to share software and inventions... – ... Responsibility that investigators and organizations have as members of the scientific and engineering community, to make results, data and collections available to other researchers. • Industry – Papers with only authors from industry have a low rate of reproducibility
  • 14. So, What Were Their Excuses? • Versioning Problems • Code Will be Available Soon • Programmer Left • Bad Backup Practices • Commercial Code • Proprietary Academic Code • Unavailable Subsystems • Multiple Reasons • Intellectual Property • Research vs. Sharing • Security and Privacy • Poor Design • Too Busy to Help. Image source: www.funnyjunk.com
  • 15. RESULTS: “Attached is the (system) source code of our algorithm. I’m not very sure whether it is the final version of the code used in our paper, but it should be at least 99% close.” “Thank you for your interest in our work. Unfortunately the current system is not mature enough at the moment, so it’s not yet publicly available...” “I am afraid that the source code was never released. The code was never intended to be released so is not in any shape for general use.” “(STUDENT) was a graduate student in our program but he left a while back so I am responding instead...” “Thanks ... Unfortunately, the server in which my implementation was stored had a disk crash in April and three disks crashed simultaneously...” “The code is owned by (COMPANY), ...is not open-source...Your best bet is to reimplement :( Sorry” “...sources are not meant to be opensource..I do not have the liberty of making available The source code at my current institution (UNIVERSITY)...”
  • 16. RESULTS: “Most importantly, I do not have the bandwidth to help anyone come up to speed on this stuff.”
  • 19. RECOMMENDATIONS: • Conferences should require the code along with every paper submitted • Build special tools that can run reliably and with reproducible results • Build web sites that allow authors to make their code available to colleagues • Do not follow bad habits such as the “publish and forget” style of scientific research
  • 21. LESSONS LEARNED: 1. Unless you have compelling reasons not to, plan to release the code. 2. Students will leave; plan for it. 3. Create permanent email addresses. 4. Create project websites. 5. Use a source code control system. 6. Back up your code. 7. Resolve licensing issues. 8. Keep your promises. 9. Plan for longevity. 10. Avoid cool but unusual design. 11. Plan for reproducible releases.
  • 24. Reproducible Research in Computational Science, Roger D. Peng. http://www.sciencemag.org/content/334/6060/1226.full
  • 25. Rule 1: For Every Result, Keep Track of How It Was Produced. Source for Rules 1 to 10: Ten Simple Rules for Reproducible Computational Research, http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003285
  • 26. Rule 2: Avoid Manual Data Manipulation Steps
  • 27. Rule 3: Archive the Exact Versions of All External Programs Used
  • 28. Rule 4: Version Control All Custom Scripts
  • 29. Rule 5: Record All Intermediate Results, When Possible in Standardized Formats
  • 30. Rule 6: For Analyses That Include Randomness, Note Underlying Random Seeds
  • 31. Rule 7: Always Store Raw Data behind Plots
  • 32. Rule 8: Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
  • 33. Rule 9: Connect Textual Statements to Underlying Results
  • 34. Rule 10: Provide Public Access to Scripts, Runs, and Results
  • 36. CONCLUSION: • As a discipline, we are a long way from producing research that is always, and completely, reproducible. • Sharing may increase the probability of citation. • Sharing specifications will have a positive effect on researchers’ willingness to share. • Sharing specifications can be used as a contract between authors and readers.
  • 37. HOW THIS IS RELATED TO MY PHD: Data is The New (Black) Gold • Data Quality and Trustworthiness – How close is this data to the real world? – Can I trust this data?
  • 38. FURTHER LITERATURE: • Data Replication & Reproducibility – http://www.sciencemag.org/site/special/data-rep/ • Getting Results from Testing by Laura Dillon (ACM Distinguished Speakers Program) – http://dsp.acm.org/view_lecture.cfm?lecture_id=108 • Why You Should Share Your Musical Knowledge – http://jazzadvice.com/why-you-should-share-your-musical-knowledge/ • Reproducible Research in Signal Processing – http://rr.epfl.ch/17/1/VandewalleKV09.pdf
  • 39. FURTHER LITERATURE: • RunMyCode enables scientists to openly share the code and data that underlie their research publications – http://www.runmycode.org/ • Executable Papers – http://executablepapers.com/ • CDE: Automatically create portable Linux applications (i.e., package, deliver, run). – http://www.pgbovine.net/cde.html
  • 40. FURTHER LITERATURE: • VLDB Guidelines – http://www.vldb.org/2013/experimental_reproducibility.html • Data Package Management – http://dat-data.com/ – https://github.com/maxogden/dat • Data Dryad – http://datadryad.org/
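
Several of the ten rules on slides 25 to 34 can be illustrated with a small sketch. The Python below is only one possible reading of Rules 1, 3, 6 and 7, not something taken from the slides or from the cited article. The function names `run_analysis` and `provenance_record`, the record's field names, and the seed value 2014 are all hypothetical, chosen for illustration.

```python
import json
import platform
import random
import sys
from datetime import datetime, timezone


def run_analysis(seed: int, n: int = 5) -> list[float]:
    """Toy stand-in (hypothetical) for an analysis that involves randomness."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    return [rng.random() for _ in range(n)]


def provenance_record(seed: int, results: list[float]) -> dict:
    """Bundle a result with the information needed to regenerate it."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "command": sys.argv,                          # Rule 1: how it was produced
        "python_version": platform.python_version(),  # Rule 3: exact versions used
        "platform": platform.platform(),
        "seed": seed,                                 # Rule 6: note the random seed
        "raw_results": results,                       # Rule 7: raw data behind plots
    }


# Run once, then serialize the record so it can be stored next to the output.
record = provenance_record(seed=2014, results=run_analysis(seed=2014))
serialized = json.dumps(record, indent=2)
```

Writing `serialized` to a file beside each figure or table would let a reader rerun `run_analysis` with the recorded seed and compare the numbers. A real project would probably also record the versions of external libraries and the commit of its own scripts (Rule 4), which this sketch leaves out.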