1ownR Enterprise R platform
ownR Enterprise Analytics Platform
for R and Python
A technical introduction
2ownR Enterprise R platform
Motivation
1. Can new hires get set up in the environment to run analyses
on their first day?
2. Can data scientists utilize the latest tools/packages without
help from IT?
3. Can data scientists use on-demand and scalable compute
resources without help from IT/dev ops?
4. Can data scientists find and reproduce past experiments and
results, using the original code, data, parameters, and
software versions?
The “Joel Test” for Data Science
3ownR Enterprise R platform
Motivation
5. Does collaboration happen through a system other than
email?
6. Can predictive models be deployed to production without
custom engineering or infrastructure work?
7. Is there a single place to search for past research and
reusable data sets, code, etc.?
8. Do your data scientists use the best tools money can buy?
Source: https://blog.dominodatalab.com/joel-test-data-science/
+1 Can you access your statistics without using R/Python?
The “Joel Test” for Data Science
4ownR Enterprise R platform
Ideal Process
From code to production
SCM
Data scientist /
Developer:
• commits code
5ownR Enterprise R platform
Ideal Process
From code to production
SCM Package
CI / CD
• Documentation
• Unit tests
• Build
6ownR Enterprise R platform
Ideal Process
From code to production
SCM Package Repo
CI / CD
• Upload to
repository
7ownR Enterprise R platform
Ideal Process
From code to production
SCM Package Repo
Prod
CI/CD
• Deploy the
application
8ownR Enterprise R platform
Ideal Process
From code to production
SCM Package Repo
ProdProdProd
Scale the
application
Load Balancer
9ownR Enterprise R platform
Ideal Process
From code to production
SCM Package Repo
ProdProdProd
Load Balancer
Excel or web
front-end
Scheduler Secure
Shiny app
REST API REST APINon-technical end
user’s access:
10ownR Enterprise R platform
ownR repository
1. Multiple repositories
• Shared and Private repositories
• Local public repository mirrors
2. Share your applications with colleagues for reusability
3. Search public and internal packages
4. Dependency management across all repositories
5. Maintain multiple (historical) versions of the same package
The application repository of the ownR platform
11ownR Enterprise R platform
ownR repository
1. Can new hires get set up in the environment to run analyses
on their first day?
2. Can data scientists utilize the latest tools/packages without
help from IT? ✅
3. Can data scientists use on-demand and scalable compute
resources without help from IT/dev ops?
4. Can data scientists find and reproduce past experiments and
results, using the original code, data, parameters, and
software versions?
The “Joel Test” for Data Science
12ownR Enterprise R platform
ownR repository
5. Does collaboration happen through a system other than
email? ✅
6. Can predictive models be deployed to production without
custom engineering or infrastructure work?
7. Is there a single place to search for past research and
reusable data sets, code, etc.? ✅
8. Do your data scientists use the best tools money can buy?
+1 Can you access your statistics without using R/Python?
The “Joel Test” for Data Science
13ownR Enterprise R platform
roveR
1. Preconfigured, separated R environments
2. Open-source R package
3. Linking R environments to the ownR repository
4. Document the dependencies in a DESCRIPTION file
5. Install dependencies with specific versions from the repository
Note: Python solves this with virtualenv, YAML and pip.
Open source R virtual environment
14ownR Enterprise R platform
roveR + ownR repository
1. Can new hires get set up in the environment to run analyses
on their first day? ✅
2. Can data scientists utilize the latest tools/packages without
help from IT? ✅
3. Can data scientists use on-demand and scalable compute
resources without help from IT/dev ops?
4. Can data scientists find and reproduce past experiments and
results, using the original code, data, parameters, and
software versions? ✅
The “Joel Test” for Data Science
15ownR Enterprise R platform
roveR + ownR repository
5. Does collaboration happen through a system other than
email? ✅
6. Can predictive models be deployed to production without
custom engineering or infrastructure work? ✅
7. Is there a single place to search for past research and
reusable data sets, code, etc.? ✅
8. Do your data scientists use the best tools money can buy?
+1 Can you access your statistics without using R/Python? ✅
The “Joel Test” for Data Science
16ownR Enterprise R platform
ownR production applications
1. Access deployed R & Python applications via REST API
2. Allows other applications to integrate analytics calculations
developed in R or Python without porting to a third language
3. Also supporting Flask and Shiny apps for web applications
4. Scalable solution with load balancing
5. Secure access out-of-the-box
• Using API keys for API applications
• User authentication & authorization for web applications
Scalable secure R & Python applications
17ownR Enterprise R platform
ownR platform
1. Can new hires get set up in the environment to run analyses
on their first day? ✅
2. Can data scientists utilize the latest tools/packages without
help from IT? ✅
3. Can data scientists use on-demand and scalable compute
resources without help from IT/dev ops? ✅
4. Can data scientists find and reproduce past experiments and
results, using the original code, data, parameters, and
software versions? ✅
The “Joel Test” for Data Science
18ownR Enterprise R platform
ownR platform
5. Does collaboration happen through a system other than
email? ✅
6. Can predictive models be deployed to production without
custom engineering or infrastructure work? ✅
7. Is there a single place to search for past research and
reusable data sets, code, etc.? ✅
8. Do your data scientists use the best tools money can buy? ✅
+1 Can you access your statistics without using R/Python? ✅
The “Joel Test” for Data Science
19ownR Enterprise R platform
ownR workflow
20ownR Enterprise R platform
ownR workflow
21ownR Enterprise R platform
ownR workflow
22ownR Enterprise R platform
Questions?
info@ownr.io

ownR platform extended technical introduction

  • 1.
    1ownR Enterprise Rplatform ownR Enterprise Analytics Platform for R and Python A technical introduction
  • 2.
    2ownR Enterprise Rplatform Motivation 1. Can new hires get set up in the environment to run analyses on their first day? 2. Can data scientists utilize the latest tools/packages without help from IT? 3. Can data scientists use on-demand and scalable compute resources without help from IT/dev ops? 4. Can data scientists find and reproduce past experiments and results, using the original code, data, parameters, and software versions? The “Joel Test” for Data Science
  • 3.
    3ownR Enterprise Rplatform Motivation 5. Does collaboration happen through a system other than email? 6. Can predictive models be deployed to production without custom engineering or infrastructure work? 7. Is there a single place to search for past research and reusable data sets, code, etc.? 8. Do your data scientists use the best tools money can buy? Source: https://blog.dominodatalab.com/joel-test-data-science/ +1 Can you access your statistics without using R/Python? The “Joel Test” for Data Science
  • 4.
    4ownR Enterprise Rplatform Ideal Process From code to production SCM Data scientist / Developer: • commits code
  • 5.
    5ownR Enterprise Rplatform Ideal Process From code to production SCM Package CI / CD • Documentation • Unit tests • Build
  • 6.
    6ownR Enterprise Rplatform Ideal Process From code to production SCM Package Repo CI / CD • Upload to repository
  • 7.
    7ownR Enterprise Rplatform Ideal Process From code to production SCM Package Repo Prod CI/CD • Deploy the application
  • 8.
    8ownR Enterprise Rplatform Ideal Process From code to production SCM Package Repo ProdProdProd Scale the application Load Balancer
  • 9.
    9ownR Enterprise Rplatform Ideal Process From code to production SCM Package Repo ProdProdProd Load Balancer Excel or web front-end Scheduler Secure Shiny app REST API REST APINon-technical end user’s access:
  • 10.
    10ownR Enterprise Rplatform ownR repository 1. Multiple repositories • Shared and Private repositories • Local public repository mirrors 2. Share your applications with colleagues for reusability 3. Search public and internal packages 4. Dependency management across all repositories 5. Maintain multiple (historical) versions of the same package The application repository of the ownR platform
  • 11.
    11ownR Enterprise Rplatform ownR repository 1. Can new hires get set up in the environment to run analyses on their first day? 2. Can data scientists utilize the latest tools/packages without help from IT? ✅ 3. Can data scientists use on-demand and scalable compute resources without help from IT/dev ops? 4. Can data scientists find and reproduce past experiments and results, using the original code, data, parameters, and software versions? The “Joel Test” for Data Science
  • 12.
    12ownR Enterprise Rplatform ownR repository 5. Does collaboration happen through a system other than email? ✅ 6. Can predictive models be deployed to production without custom engineering or infrastructure work? 7. Is there a single place to search for past research and reusable data sets, code, etc.? ✅ 8. Do your data scientists use the best tools money can buy? +1 Can you access your statistics without using R/Python? The “Joel Test” for Data Science
  • 13.
    13ownR Enterprise Rplatform roveR 1. Preconfigured, separated R environments 2. Open-source R package 3. Linking R environments to the ownR repository 4. Document the dependencies in a DESCRIPTION file 5. Install dependencies with specific versions from the repository Note: Python solves this with virtualenv, YAML and pip. Open source R virtual environment
  • 14.
    14ownR Enterprise Rplatform roveR + ownR repository 1. Can new hires get set up in the environment to run analyses on their first day? ✅ 2. Can data scientists utilize the latest tools/packages without help from IT? ✅ 3. Can data scientists use on-demand and scalable compute resources without help from IT/dev ops? 4. Can data scientists find and reproduce past experiments and results, using the original code, data, parameters, and software versions? ✅ The “Joel Test” for Data Science
  • 15.
    15ownR Enterprise Rplatform roveR + ownR repository 5. Does collaboration happen through a system other than email? ✅ 6. Can predictive models be deployed to production without custom engineering or infrastructure work? ✅ 7. Is there a single place to search for past research and reusable data sets, code, etc.? ✅ 8. Do your data scientists use the best tools money can buy? +1 Can you access your statistics without using R/Python? ✅ The “Joel Test” for Data Science
  • 16.
    16ownR Enterprise Rplatform ownR production applications 1. Access deployed R & Python applications via REST API 2. Allows other applications to integrate analytics calculations developed in R or Python without porting to a third language 3. Also supporting Flask and Shiny apps for web applications 4. Scalable solution with load balancing 5. Secure access out-of-the-box • Using API keys for API applications • User authentication & authorization for web applications Scalable secure R & Python applications
  • 17.
    17ownR Enterprise Rplatform ownR platform 1. Can new hires get set up in the environment to run analyses on their first day? ✅ 2. Can data scientists utilize the latest tools/packages without help from IT? ✅ 3. Can data scientists use on-demand and scalable compute resources without help from IT/dev ops? ✅ 4. Can data scientists find and reproduce past experiments and results, using the original code, data, parameters, and software versions? ✅ The “Joel Test” for Data Science
  • 18.
    18ownR Enterprise Rplatform ownR platform 5. Does collaboration happen through a system other than email? ✅ 6. Can predictive models be deployed to production without custom engineering or infrastructure work? ✅ 7. Is there a single place to search for past research and reusable data sets, code, etc.? ✅ 8. Do your data scientists use the best tools money can buy? ✅ +1 Can you access your statistics without using R/Python? ✅ The “Joel Test” for Data Science
  • 19.
    19ownR Enterprise Rplatform ownR workflow
  • 20.
    20ownR Enterprise Rplatform ownR workflow
  • 21.
    21ownR Enterprise Rplatform ownR workflow
  • 22.
    22ownR Enterprise Rplatform Questions? info@ownr.io