
Containers in Science: neuroimaging use cases

Containers and Workflow Working Group at National Cancer Institute

Published in: Science

  1. 1. Containers in Science: neuroimaging use cases Chris Gorgolewski
  2. 2. Software in Science: current limitations. Test coverage, portability, reproducibility, reusability, documentation, user support.
  3. 3. Software in Science: current limitations. Test coverage, portability, reproducibility, reusability, documentation, user support.
  4. 4. Portability “I coded up my analysis on my laptop – how can I run it on that fancy cluster you mentioned?”
  5. 5. Portability “I want to send my analysis to a collaborator – how can I do it without writing an essay on software installation?”
  6. 6. Portability “Just got a new shiny laptop – how can I keep working on my analysis without having to spend hours setting up everything?”
  7. 7. Reproducibility “My paper was under review for 7 months; I need to rerun my analyses, but my laptop got stolen in the meantime.”
  8. 8. Reproducibility “I’m trying to replicate results from a paper; got the data, but software configuration details from the paper are missing.”
  9. 9. Reproducibility “I want my analysis method to work the same way for all scientists who use it.”
  10. 10. reproducibility == portability in time
  11. 11. Software Containers: quick refresh. Your analysis code, binary dependencies, configuration files, environment variables, data dependencies.
  12. 12. Software Containers: quick refresh Everything above the kernel level captured in one convenient package.
  13. 13. Software Containers: quick refresh. Same container runs on: Windows, Mac, Linux, HPCs. No need to port code.
  14. 14. Software Containers Software containers greatly improve portability and reproducibility* *within some limits
  15. 15. Container technology vs. implementation. There are many implementations of software containers. We use two: Docker (for single-user Windows, Mac, and Linux machines) and Singularity (for multi-user HPCs).
  16. 16. Docker vs. Singularity Kurtzer GM, Sochat V, Bauer MW (2017) Singularity: Scientific containers for mobility of compute. PLoS ONE 12(5): e0177459.
  17. 17. Singularity workflow or https://github.com/singularityware/docker2singularity Kurtzer GM, Sochat V, Bauer MW (2017) Singularity: Scientific containers for mobility of compute. PLoS ONE 12(5): e0177459.
  18. 18. How do we use containers? Everyday custom data analysis. Development and distribution of complex pipelines. Aggregating collections of portable pipelines. Deploying analysis pipelines on a science-as-a-service platform.
  19. 19. Everyday data analysis: 1. Prototype the analysis on a small subset of data, using Docker on a laptop 2. Convert the Docker image to Singularity (docker2singularity) 3. Copy the image to a cluster 4. Run at scale
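The laptop-to-cluster steps on this slide could be scripted roughly as follows. This is a minimal sketch that only builds and prints the command lines; it does not run anything. The image name `myorg/myanalysis:latest`, all paths, the `.simg` file name, and the cluster host are hypothetical, and the docker2singularity invocation is an assumption based on that project's README, so check it against the current documentation before use.

```python
import shlex


def docker_run(image, data_dir, out_dir, args=()):
    """Step 1: prototype on a laptop with Docker (data mounted read-only)."""
    return ["docker", "run", "--rm",
            "-v", f"{data_dir}:/data:ro",
            "-v", f"{out_dir}:/out",
            image, *args]


def docker2singularity(image, image_dir):
    """Step 2: convert the Docker image (assumed docker2singularity usage)."""
    return ["docker", "run", "--rm", "--privileged", "-t",
            "-v", "/var/run/docker.sock:/var/run/docker.sock",
            "-v", f"{image_dir}:/output",
            "singularityware/docker2singularity", image]


def singularity_run(image_file, data_dir, out_dir, args=()):
    """Step 4: run the converted image on the cluster with bind mounts."""
    return ["singularity", "run",
            "-B", f"{data_dir}:/data",
            "-B", f"{out_dir}:/out",
            image_file, *args]


# Hypothetical names, used for illustration only.
IMAGE = "myorg/myanalysis:latest"
steps = [
    docker_run(IMAGE, "/home/me/subset", "/home/me/out"),
    docker2singularity(IMAGE, "/home/me/images"),
    # Step 3: copy the image file to the cluster.
    ["scp", "/home/me/images/myanalysis.simg", "me@cluster:/scratch/me/"],
    singularity_run("/scratch/me/myanalysis.simg",
                    "/scratch/me/full_data", "/scratch/me/out"),
]

for step in steps:
    print(shlex.join(step))
```

Printing the commands first makes it easy to review them before pasting them into a terminal or a cluster job script.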
  20. 20. FMRIPREP http://fmriprep.readthedocs.io
  21. 21. MRIQC http://mriqc.readthedocs.io
  22. 22. Development and distribution of complex pipelines MRIQC and FMRIPREP depend on a lot of software: 1. AFNI 2. FSL 3. FreeSurfer 4. ANTs 5. Nipype 6. Nilearn 7. Etc…
  23. 23. Development and distribution of complex pipelines. Setting up all of the binary dependencies is a major roadblock for users.
  24. 24. Development and distribution of complex pipelines Containers solve two problems: 1. Ease of installation for users 2. Consistent environment for developers
  25. 25. Aggregating collections of portable pipelines. Containers can also be used to distribute collections of processing pipelines. portability + ease of use == containers + data standards
  26. 26. BIDS Apps BIDS Apps is a natural combination of container technology and neuroimaging dataset description.
  27. 27. BIDS Apps. The goal: choose a data analysis pipeline from a library and quickly run it on your data. Components: 1. Input data standard: Brain Imaging Data Structure 2. Command line interface standard 3. Containers
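To make the command line interface standard concrete, here is a minimal sketch of an entry point in that style: three positional arguments (input dataset, output directory, analysis level) plus an optional participant filter, as described in the BIDS Apps paper. The app description and the example paths are hypothetical, and a real BIDS App would go on to run its pipeline on the parsed arguments.

```python
import argparse


def build_parser():
    """Build a parser following the BIDS Apps command line convention."""
    parser = argparse.ArgumentParser(
        description="Hypothetical example BIDS App (illustration only)")
    parser.add_argument(
        "bids_dir", help="directory with the input dataset in BIDS format")
    parser.add_argument(
        "output_dir", help="directory where results are written")
    parser.add_argument(
        "analysis_level", choices=["participant", "group"],
        help="run per participant, or aggregate at the group level")
    parser.add_argument(
        "--participant_label", nargs="+", default=[],
        help="labels of participants to analyze (default: all)")
    return parser


# Example invocation with hypothetical paths and labels.
args = build_parser().parse_args(
    ["/data/ds001", "/data/out", "participant",
     "--participant_label", "01", "02"])
print(args.bids_dir, args.analysis_level, args.participant_label)
```

Because every app accepts the same arguments, a user or a platform can swap one pipeline for another without learning a new interface.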
  28. 28. Simple parallelization scheme – map/reduce Gorgolewski KJ, Alfaro-Almagro F, Auer T, Bellec P, Capotă M, Chakravarty MM, et al. (2017) BIDS apps: Improving ease of use, accessibility, and reproducibility of neuroimaging data analysis methods. PLoS Comput Biol 13(3)
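The map/reduce scheme can be sketched as follows: one independent participant-level run per subject (the map step), followed by a single group-level run over the collected outputs (the reduce step). This is an illustrative dry run under stated assumptions: the image name `bids/example`, the paths, and the labels are placeholders, and a local thread pool stands in for what would normally be separate cluster jobs.

```python
from concurrent.futures import ThreadPoolExecutor


def base_cmd(image, bids_dir, out_dir):
    """Docker command prefix shared by both analysis levels."""
    return ["docker", "run", "--rm",
            "-v", f"{bids_dir}:/bids_dataset:ro",
            "-v", f"{out_dir}:/outputs",
            image, "/bids_dataset", "/outputs"]


def participant_cmd(image, bids_dir, out_dir, label):
    """Map step: analyze a single participant."""
    return base_cmd(image, bids_dir, out_dir) + [
        "participant", "--participant_label", label]


def group_cmd(image, bids_dir, out_dir):
    """Reduce step: aggregate all participant-level outputs."""
    return base_cmd(image, bids_dir, out_dir) + ["group"]


def run_map_reduce(image, bids_dir, out_dir, labels, runner, max_workers=4):
    """Run participant jobs in parallel, then the group job once."""
    jobs = [participant_cmd(image, bids_dir, out_dir, label)
            for label in labels]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(runner, jobs))  # wait for every map job to finish
    runner(group_cmd(image, bids_dir, out_dir))


# Dry run: record the commands instead of executing them.
planned = []
run_map_reduce("bids/example", "/data/ds001", "/data/out",
               ["01", "02", "03"], planned.append)
for cmd in planned:
    print(" ".join(cmd))
```

To execute for real, the `runner` argument could be replaced with a function that calls `subprocess.run(cmd, check=True)` or submits each command to a scheduler.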
  29. 29. BIDS Apps: organic growth. Fewer restrictions than other software distribution schemes (e.g., Debian). Developers are in control.
  30. 30. BIDS Apps: misusing Docker. Strong versioning and testing require careful planning. Modern Continuous Integration services are essential. bids-apps.neuroimaging.io
  31. 31. Using containers in a Science as a Service platform. The ultimate goal: making more data available to more researchers.
  32. 32. OpenNeuro The carrot: cutting edge computationally expensive analysis pipelines with a click of a button. The “price”: the input data and the analysis results become publicly available after a grace period
  33. 33. OpenNeuro The carrot: cutting edge computationally expensive analysis pipelines with a click of a button. How? Containers!
  34. 34. Summary. Containers are useful for: individual scientists, pipeline developers, Science as a Service platforms.
