The Road to Reproducible Computational Research

Testing and Developing Tools to Promote
the Reproducibility of Computational
Research
Andrey Moskalenko
Center for Theoretical and Computational Materials Science
Daniel Wheeler | Faical Yannick P. Congo

Reproducible Research
• Main Areas:
• Computational
• Experimental

Table of Contents
• Context of the Project
• Simulation Management
• Sumatra and CoRR
• Environment and Examples
• Benchmark Phase Field Problem
• Conclusion

Simulation Management
The Goal
Computational Research Now

Currently available tools
• Robust
• Command line
• Web integration
• Highly collaborative
• Not suitable for capturing execution context
• Suitable for recording stable automated executions
• Provides log, search and view of execution history
• Capture entire simulation context
• Version environments
• Collaborative
• Not collaborative with current tools
• Not robust or ubiquitous
• Not suitable for log, search and view of history
• Suitable for building pipelines of distinct tasks
• Enables a clear division of tasks for non-experts
• Black box design for each section of the pipeline
• Monolithic in nature, encouraging an isolated ecosystem of tools

Sumatra and CoRR
- What is it good for?
- What are the limitations?
- Autonomous
- Local and cloud storage
- Continuously recording
- Compatible
- Click-and-run
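
To make "continuously recording" concrete, here is a minimal sketch of what a run record can look like. It is not the Sumatra or CoRR API; the names record_run, toy_simulation and the runs.jsonl log file are hypothetical and use only the Python standard library.

```python
import hashlib
import json
import time
from datetime import datetime, timezone


def record_run(simulation, parameters, log_path="runs.jsonl"):
    """Run simulation(**parameters) and append one record to a JSON-lines log.

    Hypothetical helper for illustration only; not part of Sumatra or CoRR.
    """
    started = datetime.now(timezone.utc).isoformat()
    tic = time.perf_counter()
    result = simulation(**parameters)
    duration = time.perf_counter() - tic
    # A digest of the output lets two runs be compared without storing the data.
    digest = hashlib.sha256(repr(result).encode("utf-8")).hexdigest()
    record = {
        "started": started,
        "duration_s": duration,
        "parameters": parameters,
        "result_sha256": digest,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def toy_simulation(steps, dt):
    """Stand-in for a real simulation: returns the time points it visited."""
    return [round(i * dt, 6) for i in range(steps)]
```

Calling record_run(toy_simulation, {"steps": 3, "dt": 0.5}) twice appends two records with the same result_sha256, which is the kind of check a reproducibility tool automates.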

Sumatra and CoRR
dt = 1
equation = f()
while elapsed_time < desired_duration:
    result1 = equation.solve(dt=dt, solver=LinearPCG)
    result2 = equation.solve(dt=small_dt, solver=LinearPCG)
    if result1 is not within tolerance of result2:
        decrease dt and solve again
    else:
        increase dt and solve again
extract data
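
The adaptive time-stepping loop above can be sketched as runnable Python. This is a toy under stated assumptions, not the phase-field code: the equation is replaced by the scalar decay problem dy/dt = -y, the solver is a hand-written backward Euler step, the reference solution is two half steps, and the names adaptive_integrate, shrink and grow are hypothetical.

```python
import math


def backward_euler_step(y, dt, rate=1.0):
    """One implicit Euler step for dy/dt = -rate * y."""
    return y / (1.0 + rate * dt)


def adaptive_integrate(y0, duration, dt=1.0, tolerance=1e-3,
                       shrink=0.5, grow=1.2):
    """Integrate to `duration`, adapting dt from a coarse/fine comparison."""
    t, y = 0.0, y0
    history = [(t, y)]
    while duration - t > 1e-12:
        dt = min(dt, duration - t)              # do not step past the end
        coarse = backward_euler_step(y, dt)     # one full step
        half = backward_euler_step(y, dt / 2)   # two half steps as reference
        fine = backward_euler_step(half, dt / 2)
        error = abs(coarse - fine)
        if error > tolerance * max(abs(fine), 1e-12):
            dt *= shrink                        # reject: decrease dt, retry
            continue
        t += dt                                 # accept the finer solution
        y = fine
        history.append((t, y))
        dt *= grow                              # try a larger step next time
    return history


history = adaptive_integrate(y0=1.0, duration=5.0)
t_end, y_end = history[-1]
print(f"accepted steps: {len(history) - 1}")
print(f"y({t_end:.2f}) = {y_end:.5f}, exact = {math.exp(-t_end):.5f}")
```

Rejected steps are not recorded, so the history holds only accepted states, which is the data a run record would extract.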

Environment
Workflow
• Definition
• Jupyter Notebook (formerly IPython Notebook)
• Libraries
• GitHub
• Cluster

Analysis – phase-field model
1 Performance evaluation
2 Test CoRR and Sumatra functionality
3 Results

Analysis – phase-field model
Results

Why is reproducibility a difficult task?
• Versions and updates
• Legality
• Hardware
• Python libraries and dependencies
• Time drain
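
Several of these obstacles (versions, hardware, Python libraries) come down to not knowing what a result was computed with. A minimal sketch of capturing that context using only the Python standard library follows; the function name capture_environment and the default package list are assumptions for illustration, not something taken from Sumatra or CoRR.

```python
import json
import platform
from importlib import metadata


def capture_environment(packages=("numpy", "scipy", "fipy")):
    """Return a snapshot of interpreter, OS, hardware and package versions.

    The package list is an illustrative assumption; pass the names that
    matter for the simulation at hand.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
    }


snapshot = capture_environment()
print(json.dumps(snapshot, indent=2))
```

Storing such a snapshot next to each result makes it possible to tell later whether two runs differ because of the code or because of the environment.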

Conclusion
1 Could you reproduce our phase-field results?
2 Problem: CHiMaD benchmark problem
  Solution: CoRR
3 More work to be done in both areas

Acknowledgements
1 Mentors
  Daniel Wheeler, Ph.D.
  Faical Yannick P. Congo, Ph.D.
2 MML Thermodynamics and Kinetics group
3 Anushka Dasgupta
4 All who made NIST SURF possible
