Doug Ashton - Consultant
dashton@mango-solutions.com
Reproducible Environments
Doug Ashton
Overview
• Reproducible Code
• Reproducible Packages
• Reproducible Environments
Reproducible Code
Can you just re-run those numbers for me?
code and data
Two Warnings
• Retracted cancer research: http://bit.ly/londonr01 (video)
• Reinhart, Rogoff, and the Excel Error That Changed History: http://bit.ly/londonr02
The solution: code
A Reproducible Flow
Raw Data → Script → Results
Demo
• https://github.com/dougmet/repEnvDemo
Input data
Output data
Manipulation
Calculation
Dependencies
Plots
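The parts listed on this slide can be sketched as a single script. This is a minimal illustration rather than the code from the demo repository: the data, the file names and the `work_dir` location are all made up for the example.

```r
# Hypothetical example data and file names; not taken from the demo repository.
work_dir <- file.path(tempdir(), "repro_demo")
dir.create(work_dir, showWarnings = FALSE)
raw_file     <- file.path(work_dir, "raw.csv")
results_file <- file.path(work_dir, "results.csv")
plot_file    <- file.path(work_dir, "means.pdf")
session_file <- file.path(work_dir, "sessionInfo.txt")

# Stand-in for the raw data: written once here, never edited by hand afterwards.
write.csv(data.frame(group = c("a", "a", "b", "b"),
                     value = c(1, 3, 10, 14)),
          raw_file, row.names = FALSE)

# Input data
raw <- read.csv(raw_file, stringsAsFactors = FALSE)

# Manipulation: drop rows with missing values
clean <- raw[!is.na(raw$value), ]

# Calculation: mean value per group
results <- aggregate(value ~ group, data = clean, FUN = mean)

# Output data
write.csv(results, results_file, row.names = FALSE)

# Plots
pdf(plot_file)
barplot(results$value, names.arg = results$group, ylab = "Mean value")
dev.off()

# Dependencies: record the R version and loaded packages next to the results
writeLines(capture.output(sessionInfo()), session_file)
```

Re-running the script from the raw file regenerates every output, which is the point of the flow: nothing in the results depends on manual steps.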
Reproducible Research Tools
• knitr/Sweave
• rmarkdown
• Jupyter (IPython) Notebook
• RCloud
• git
Reproducible Packages
The new problem
Your Laptop
Their server
Package Versions
• Packrat/checkpoint/switchr
• MRAN mirror snapshots
• E.g. https://mran.revolutionanalytics.com/snapshot/2015-03-01
• Managed installation (validR)
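As a rough sketch of how a dated snapshot can be used without any extra package, the session's CRAN repository can be pointed at the snapshot URL. The helper names `snapshot_repo` and `use_snapshot` are invented for this example; the URL pattern is the one shown above, and whether the server still serves that snapshot is an assumption.

```r
# Build the snapshot URL for a given date (URL pattern taken from the slide).
snapshot_repo <- function(date) {
  date <- as.Date(date)  # fails early on a malformed date
  paste0("https://mran.revolutionanalytics.com/snapshot/",
         format(date, "%Y-%m-%d"))
}

# Point this R session at the snapshot instead of a live CRAN mirror.
# (Hypothetical helper; it only sets the option and downloads nothing.)
use_snapshot <- function(date) {
  options(repos = c(CRAN = snapshot_repo(date)))
  invisible(getOption("repos"))
}

use_snapshot("2015-03-01")
print(getOption("repos")[["CRAN"]])
```

After this, a call to `install.packages()` in the same session would look for packages as they stood on that date, assuming the snapshot is reachable.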
Package Management Packages
Package      Pros                                Cons
Packrat      Built into RStudio                  Bulky in repo (e.g. git); needs foresight
Checkpoint   Easy; rescues old scripts           Lots of libraries; only one date
Switchr      Package manifests; repo friendly    Not easy
Packrat
Checkpoint
Pros
• Get old scripts working
• Simple (just set a date): checkpoint("2014-11-05")
• Downloads packages from MRAN on that date (more later)
Cons
• Could end up with lots of packages (a library for each date)
• Doesn't help with multiple dates
• No GitHub
Switchr
• No demo
Reproducible Environments
1. System Dependencies
2. Managed Environments
3. Scalability
Vagrant
• www.vagrantup.com
• Primarily for development
• Each project gets a virtual machine
Docker
• www.docker.com
• Primarily for deployment
• One VM (Windows/Mac)
• No VMs (Linux)
• Scalable
Docker Images are Hierarchical
Ubuntu 14.04 → R-base → R-Finance / R-Spark / R-Hadley → Project-X / Project-Y
Docker Containers
Multiple containers from one image, launch rapidly
A Reproducible Flow
Raw Data → Script → Results
Rocker Warning
Docker Demo
• https://github.com/dougmet/dockerR
FROM dougmet/r-base:3.1.2
RUN apt-get -y install libgsl0ldbl=1.16*1 libgsl0-dev=1.16*
# Install R package manifest
COPY loadPackages.R /tmp/
COPY packages.csv /tmp/
RUN Rscript /tmp/loadPackages.R
CMD ["R"]
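The slides do not show what loadPackages.R or packages.csv contain, so the following is only a guess at the idea of a package manifest. It assumes a hypothetical CSV layout with `package` and `version` columns and checks the current library against it; the function names are invented for the example, and actually installing the pinned versions is left out.

```r
# Hypothetical manifest layout: one row per package with a pinned version.
# "stats" ships with R, so its version always equals the R version;
# "notARealPackage" stands in for a package that is missing.
manifest_file <- tempfile(fileext = ".csv")
writeLines(c("package,version",
             paste0("stats,", as.character(getRversion())),
             "notARealPackage,1.0.0"),
           manifest_file)

# Return the installed version of a package, or NA if it is not installed.
installed_version <- function(pkg) {
  tryCatch(as.character(utils::packageVersion(pkg)),
           error = function(e) NA_character_)
}

# Compare every manifest entry against what is installed in this library.
check_manifest <- function(path) {
  manifest <- read.csv(path, stringsAsFactors = FALSE,
                       colClasses = "character")
  manifest$installed <- vapply(manifest$package, installed_version,
                               character(1), USE.NAMES = FALSE)
  manifest$ok <- !is.na(manifest$installed) &
    manifest$installed == manifest$version
  manifest
}

status <- check_manifest(manifest_file)
print(status)
```

A check like this could run as the last step of an image build, so that a mismatch between the manifest and the library is caught when the image is built rather than when the analysis is run.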
The near future
• More centralised R installations
• Centralised images
• Windows containers (Open Container Initiative)
• Data Science Workbench
• Managed image repo
• One click to open project
• RStudio integration
Get in touch
• GitHub: MangoTheCat, dougmet
• Twitter: @dougashton
• Email: dashton@mango-solutions.com
