Know your R usage workflow to handle reproducibility challenges

Wit Jakuczun
I solve business problems with mathematics.
Copyright (c) WLOG Solutions
Budapest, 2018
Meet the personas
John: student/hobbyist
Kate and Henry: freelancer/scientist/consultant
The Team: corporate/in-house team
They were coding in R happily until that one day...
https://xkcd.com/234/
John
Could not deliver his R lab homework because of a package incompatibility on the professor's laptop.
Kate and Henry
Missed deadlines because of problems installing packages for their R Shiny app on a customer's server running Red Hat Enterprise Linux 6.8.
The Team
Had serious package version conflicts caused by many users and many projects sharing a Red Hat Enterprise Linux machine without internet access.
Three different stories, the same reproducibility problem.
What is reproducibility?
Reproducibility is the ability to run your code repeatedly, at different times, on different computers, in such a way as to obtain the same outputs given the same inputs.
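A minimal sketch of this definition in R, assuming a hypothetical function `simulate_sales()` that stands in for "your code". Fixing the random seed makes repeated runs agree on one machine with one R version; it does not by itself guarantee the same numbers at a different time or on a different computer, which is the harder part of the definition.

```r
# Hypothetical stand-in for "your code": a small simulation driven by a seed.
simulate_sales <- function(n, seed) {
  set.seed(seed)  # fix the random number generator state
  round(rnorm(n, mean = 100, sd = 15), 2)
}

run_1 <- simulate_sales(n = 5, seed = 2018)
run_2 <- simulate_sales(n = 5, seed = 2018)

# Same inputs, same outputs (on this machine, with this R version).
identical(run_1, run_2)
```

The seed is treated as one of the inputs here, so that "same inputs" includes the source of randomness.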
Bare metal
Operating system
Solution dependencies
Code
Data
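Some of the layers listed above can be recorded from inside R itself. The sketch below is one possible way to do that; `describe_environment()` is a hypothetical helper name, not something from the talk, and it only sees what the running session exposes (it says very little about the bare metal).

```r
# Collect a short description of the layers visible to the current R session.
describe_environment <- function() {
  info <- Sys.info()
  list(
    hardware  = info[["machine"]],                             # CPU architecture
    os        = paste(info[["sysname"]], info[["release"]]),   # operating system
    r_version = R.version.string,                              # R itself
    # Solution dependencies: every loaded package with its version.
    packages  = vapply(
      loadedNamespaces(),
      function(pkg) as.character(packageVersion(pkg)),
      character(1)
    )
  )
}

env <- describe_environment()
str(env)
```

Saving such a record next to the results gives a later reader something to compare against when outputs differ.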
A few examples
2016-01-03: R 3.2.3, forecast v6.2
- Rcpp (>= 0.11)
2016-09-08: R 3.3.1, forecast v7.2
- ggplot2 (>= 2.0.0)
- Rcpp (>= 0.11)
- Added gglagplot
2017-03-01: R 3.3.2, forecast v8.0
- ggplot2 (>= 2.0.0)
- Rcpp (>= 0.11)
- Modified defaults for gglagplot
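The forecast example above shows how the same `library(forecast)` call can resolve to different code depending on the installation date. A minimal sketch of a guard against that drift follows; `versions_match()` and `satisfies_minimum()` are hypothetical helper names, and the version strings are the ones from the example.

```r
# TRUE when the installed version is exactly the one the project was developed against.
versions_match <- function(installed, expected) {
  utils::compareVersion(installed, expected) == 0
}

# TRUE when a dependency constraint of the form "pkg (>= minimum)" is met.
satisfies_minimum <- function(installed, minimum) {
  package_version(installed) >= package_version(minimum)
}

versions_match("7.2", "7.2")         # developed and run with forecast 7.2
versions_match("8.0", "7.2")         # a later install brings forecast 8.0
satisfies_minimum("2.2.1", "2.0.0")  # checks a constraint like ggplot2 (>= 2.0.0)
```

In a real project the installed value would come from `as.character(packageVersion("forecast"))`. Note that forecast 8.0 declares the same dependencies as 7.2 yet changes the `gglagplot` defaults, so a minimum-version check alone would not catch the change; an exact match would.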
Development Production
I recommend using the rocker/r-ver Docker images.
When is reproducibility important while you program in R?
Deploy (share) solution to production
Development: Debian/Ubuntu, RedHat/CentOS, Windows
Production: Debian/Ubuntu, RedHat/CentOS, Windows
Restore development environment
Development: Debian/Ubuntu, RedHat/CentOS, Windows
Development’: Debian/Ubuntu, RedHat/CentOS, Windows
Three workflows, three reproducibility solutions.
John, student/hobbyist
Dev/Production
Version control
Family & Friends or Professor
MRAN
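For this workflow, the idea behind a dated snapshot repository can be sketched in R as below. `snapshot_repo()` is a hypothetical helper, and the default base URL follows the pattern MRAN used for its snapshots; treat it as an assumption and replace it with whatever snapshot service is available to you.

```r
# Build a date-pinned repository URL.
# The default base URL is an assumption modelled on MRAN's snapshot pattern.
snapshot_repo <- function(date, base = "https://mran.microsoft.com/snapshot") {
  date <- as.Date(date)  # fails early on a malformed date
  paste0(base, "/", format(date, "%Y-%m-%d"))
}

repo <- snapshot_repo("2017-03-01")

# Point this session at the snapshot; packages installed afterwards
# would be requested as they were on that date.
options(repos = c(CRAN = repo))
getOption("repos")[["CRAN"]]
```

Putting these lines at the top of a script kept under version control means the person receiving the script (a friend or the professor) requests the same package versions as the author.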
Kate and Henry, consultancy team/freelancer/scientist
Dev, Production
Continuous integration
Version control
Local CRAN
MRAN
On-premise, cloud, Spark, etc.
The Team, corporate/in-house team
Dev, Production
Continuous
integration
Version
control
Local CRAN
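A local CRAN-like repository is, at its simplest, a directory with a package index that R can install from without internet access. The sketch below shows one way to set one up with base R tools; `create_local_repo()` and the directory name are hypothetical, and the repository stays empty until source tarballs are copied into it.

```r
# Create the directory layout R expects for a source-package repository
# and return a URL that install.packages() can use.
create_local_repo <- function(root) {
  contrib <- file.path(root, "src", "contrib")
  dir.create(contrib, recursive = TRUE, showWarnings = FALSE)
  # Index whatever package tarballs (*.tar.gz) have been copied into contrib.
  tools::write_PACKAGES(contrib, type = "source")
  # file:// form for Unix-like paths; on Windows the URL needs the
  # "file:///C:/..." form instead.
  paste0("file://", normalizePath(root, winslash = "/"))
}

repo_url <- create_local_repo(file.path(tempdir(), "local-cran"))
repo_url

# Once tarballs are present and the index has been rebuilt:
# install.packages("somepkg", repos = repo_url, type = "source")
```

Because every project installs from the same controlled directory, package versions change only when someone deliberately updates the repository.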
One word on Docker
Development, Production
Build for different OS
Deployment package (.zip)
Second word on Docker
Development, Production
Build Docker image
CRAN management
Multiple R versions
Debian/Ubuntu, Windows, RedHat/CentOS
Docker
Jenkins
Isolated projects
No installation on prod
Internetless environments
System requirements
Git/SVN
Binary packages
http://rsuite.io
https://github.com/WLOGSolutions/RSuite
https://www.slideshare.net/WLOGSolutions

Wit Jakuczun
CEO
wit.Jakuczun@wlogsolutions.com
+48 601820620
http://www.wlogsolutions.com
