3 Ways to Move Your Data Science Projects to Production: Secure and Scalable Data Science Deployment with Anaconda

Traditional data science project deployments involve lengthy and complex processes to deliver secure and scalable applications in enterprise environments. As a result, data scientists must spend a nontrivial amount of time setting up, configuring and maintaining deployment infrastructure.

Why take valuable time away from data exploration and analysis workflows when Anaconda can do that for you?

We are here to help—with Anaconda, you can easily productionize your data science projects and applications, and choose the deployment method to use. In this live webinar, Christine and Kris will demonstrate how Anaconda empowers data scientists to encapsulate and deploy their data science projects as live applications with a single click.

They will show you how to:
- Decide the right deployment strategy for you and your team
- Encapsulate data science projects with Anaconda Project
- Deploy Self-Service Notebooks, Interactive Applications (e.g. Bokeh), Dashboards and Machine Learning Models with REST API using Anaconda Enterprise

Published in: Technology

  1. Three Ways to Move Your Data Science Projects to Production: Secure and Scalable Data Science Deployment with Anaconda
      Christine Doig and Kris Overholt, May 24, 2017
      © 2016 Continuum Analytics - Confidential & Proprietary; © 2017 Continuum Analytics - Confidential & Proprietary
  2. Christine Doig, Senior Data Scientist
      • Worked on MEMEX, a DARPA-funded project helping stop human trafficking
      • Co-author of the recently published book, Breaking Data Science Open, published by O’Reilly
      • 5+ years of experience in analytics, operations research, and machine learning
      • MS in Industrial Engineering, Polytechnic University of Catalonia, Barcelona
  3. Kris Overholt, Product Manager
      • Developing the cluster management features of Anaconda
      • 10+ years of experience in scientific computing, systems administration, computational modeling and more
      • Ph.D. in Civil Engineering, University of Texas
      • Master’s degree, Worcester Polytechnic Institute, focus on computational fluid dynamics
  4. Agenda
      • Overview of Anaconda
      • End-to-End Collaborative Data Science Workflows
      • Data Science Development and Deployment
      • Anaconda + Docker
      • Anaconda Project
      • Anaconda Enterprise
      • Examples of Data Science Deployment
      • Getting Started with Anaconda Enterprise Deployment
  5. Overview of Anaconda
  6. Anaconda, the leading Data Science ecosystem with over 4M users
  7. ANACONDA DISTRIBUTION: Python & R distribution with 1000+ curated packages that makes it easy to get started with Data Science
      • Distributed Systems, Business Intelligence, Web, Scientific Computing / HPC, Machine Learning / Statistics
      • Numba, dask, xlwings, Airflow, Blaze
  8. https://www.continuum.io/downloads
  9. What’s in ANACONDA DISTRIBUTION?
  10. • Install data science libraries: $ conda install pandas
      • Manage package versions: $ conda install pandas=0.14
      • Create isolated environments: $ conda create -n myenv python=3.5 pandas=0.18
      • Update package version: $ conda update pandas
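The four conda commands on slide 10 can also be assembled programmatically, for example from a setup script. The sketch below is only an illustration: the helper names (`pin`, `conda_install`, `conda_create`, `conda_update`) are hypothetical and not part of conda, and the commands are printed rather than executed.

```python
import shlex


def pin(package, version=None):
    """Return a conda package spec such as 'pandas=0.18', or just 'pandas'."""
    return f"{package}={version}" if version else package


def conda_install(package, version=None):
    """Argument list for installing one package, optionally pinned."""
    return ["conda", "install", pin(package, version)]


def conda_create(env_name, packages):
    """Argument list for a new environment; packages maps name -> version or None."""
    specs = [pin(name, version) for name, version in packages.items()]
    return ["conda", "create", "-n", env_name, *specs]


def conda_update(package):
    """Argument list for updating one package."""
    return ["conda", "update", package]


# The same four operations listed on the slide.
commands = [
    conda_install("pandas"),
    conda_install("pandas", "0.14"),
    conda_create("myenv", {"python": "3.5", "pandas": "0.18"}),
    conda_update("pandas"),
]

for argv in commands:
    # Printed only; nothing is executed in this sketch.
    print("$ " + shlex.join(argv))
```

To actually run one of these, an argument list could be passed to `subprocess.run(argv, check=True)` on a machine where conda is installed.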
  11. …
  12. anaconda-project.yml
      • Define and manage:
        • project package dependencies
        • deployment commands
        • data
        • …
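To make the three items on slide 12 concrete, the sketch below renders a minimal `anaconda-project.yml` as text. It is an assumption-laden illustration, not taken from the slides: the key names (`name`, `packages`, `commands`, `downloads`) follow the anaconda-project documentation as best recalled and should be checked against it, and the project name, command, variable name, and URL are all hypothetical.

```python
def render_project(name, packages, command_name, unix_command, data_var, data_url):
    """Render a minimal anaconda-project.yml as a string.

    Key names are assumed from the anaconda-project docs; verify before use.
    """
    lines = [f"name: {name}", "packages:"]
    lines += [f"  - {spec}" for spec in packages]      # project package dependencies
    lines += [
        "commands:",                                   # deployment commands
        f"  {command_name}:",
        f"    unix: {unix_command}",
        "downloads:",                                  # data
        f"  {data_var}:",
        f"    url: {data_url}",
    ]
    return "\n".join(lines) + "\n"


# All values below are hypothetical placeholders.
project_yaml = render_project(
    name="example-project",
    packages=["python=3.5", "pandas=0.18"],
    command_name="default",
    unix_command="python analysis.py",
    data_var="DATASET",
    data_url="https://example.com/data.csv",
)
print(project_yaml)
```

Keeping dependencies, commands, and data in one file is what lets the same project description be reused on a laptop and on a server.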
  13. • Launch applications
      • Manage package versions and environments
      • Create and upload projects
  14. End-to-end Collaborative Data Science Workflows
  15. End-to-end Collaborative Data Science Workflows
      • Explore, Analyze & Collaborate
      • Scale, Deploy & Operate
  16. Explore, Analyze & Collaborate: Biz Analyst, Data Scientists
  17. Scale, Deploy & Operate: DevOps, Developer, Data Engineers
  18. Data Science Development and Deployment
  19. Challenges in the data science ecosystem
      How do you…
      • Download and install data science libraries?
      • Manage versions and dependencies?
      • Upgrade libraries?
      • Isolate dependencies between projects?
  20. What do data scientists develop?
      • Workflows: Data; Query; Visualize; Clean & Tidy; Predict, Simulate, & Optimize
      • Interactive data visualizations and dashboards; Jupyter notebooks; Scripts; Predictive models; Processed Data
  21. Data Science Development (Laptop)
      • scikit-learn, Bokeh, Tensorflow, Jupyter, pandas, matplotlib, seaborn, dask, numba
      • script 1, script 2, script 3, notebook A, dataset Z
      • Python, R
  22. Challenges in data science development and deployment
      How do you…
      • Share your data science project with others?
      • Ensure that you can reproduce your analysis?
      • Deploy your project?
  23. The Path to Simple Data Science Deployment!
      Diagram labels: Anaconda Enterprise; DIY; Anaconda Project; Anaconda; Docker containers; conda env 1, conda env 2, conda env 3
  24. Anaconda and Docker - Better Together
  25. Diagram: Data Science Development and Data Science Deployment
      • Laptop: conda env 1 (Analysis 1), conda env 2 (Analysis 2), conda env 3 (Analysis 3)
      • Server: conda env 1 (Analysis 1), conda env 2 (Analysis 2), conda env 3 (Analysis 3); Docker container
  26. https://hub.docker.com/r/continuumio/anaconda/
  27. • Dependencies
      Anaconda and Docker
      • Data
      • Deployment commands
      • Security
      • Scalability
      • Availability
  28. Portable Data Science with Anaconda Project - More than just Dockerfiles
  29. Diagram: Data Science Development and Data Science Deployment
      • Laptop: Project 1, Project 2, Project 3
      • Server: Project 1, Project 2, Project 3
  30. Diagram: Data Science Development and Data Science Deployment
      • Laptop: Project 1, Project 2, Project 3
      • Server: Project 1, Project 2, Project 3; Docker container
  31. • Dependencies
      • Data
      • Deployment commands
      Anaconda Project
      • Security
      • Scalability
      • Availability
  32. One-click Data Science Deployments with Anaconda Enterprise
  33. Diagram: Data Science Development; Data Science Development and Deployment
      • Laptop: Project 1, Project 2, Project 3
      • Anaconda Enterprise: Project 1, Project 2, Project 3; Container 1, Container 2, Container 3, Container 4
  34. • Dependencies
      • Data
      • Deployment commands
      • Security
      • Scalability
      • Availability
      Anaconda Enterprise
  36. Anaconda Enterprise Features - Data Science Deployment
      • One-click deployment of:
        • Self-Service Data Science Notebooks (Python and R)
        • Interactive visualizations and dashboards (Bokeh, Shiny, etc.)
        • Machine learning models with REST APIs
      • Secure deployments to a cluster with end-to-end SSL
      • API wrapper for easily exposing inputs/outputs for models
      • Ability to securely share apps with other users, groups, and roles (LDAP, AD, SAML, Kerberos)
  37. Anaconda Enterprise Features - Data Science Deployment
      • Ability to deploy apps and APIs that can be used/consumed via a token
      • Ability to configure CPU/memory limits for deployed apps in system-wide configuration
      • Ability to fetch logs for each app with error handling, health checks, and automatic app restarts
      • Deployments can be backed by remote storage, databases, or Hadoop/Spark
      • Cluster can be configured for high availability
  38. Example Data Science Deployments
  39. Examples Overview
      • 1) Self-service notebooks
      • 2) Interactive visualizations and dashboards
      • 3) Machine learning models with REST APIs
      • 4) Composable data science projects
      • 5) Machine learning models with visualization
  40. Example 1 - Notebooks (Python/R)
      • Self-service data science notebooks, including:
        • Python
        • R
        • Notebooks with live, attached kernels
      • Can be used to share runnable versions of analyses
      • Share running notebooks with users, groups, and roles
      • Handle portability and manage dependencies with Anaconda Project
  41. Example 2 - Interactive Visualizations
      • Deploy apps using any visualization package in Anaconda, including:
        • Bokeh
        • Shiny apps
        • Datashader
        • deck.gl
      • Develop and share visualizations and dashboards
      • Include data in project or reference remote data and databases
      • Deploy visualization apps powered by Hadoop and Spark
  42. Example 3 - Machine Learning w/ APIs
      • Machine learning models and applications with REST APIs
        • scikit-learn, Theano, Lasagne, Neon
        • Tensorflow (w/ GPU), Caffe, H2O
        • and many more!
      • Support for model scoring and prediction APIs from trained models
      • Compatible with web frameworks in Anaconda, including:
        • Flask, Django, Tornado, and more
      • Models can be shared or consumed via API tokens
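The idea on slide 42 of a scoring API consumed via a token can be sketched with nothing but the Python standard library. This is not how Anaconda Enterprise implements it, and the slide names Flask, Django, and Tornado rather than raw WSGI; the token value, the `predict` stand-in model, and the `call` test helper are all hypothetical.

```python
import hmac
import io
import json
from wsgiref.util import setup_testing_defaults

API_TOKEN = "example-token"  # hypothetical; a real deployment would issue tokens


def predict(features):
    """Stand-in for a trained model: a fixed linear score."""
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, features))


def app(environ, start_response):
    """WSGI app: accepts a JSON body, returns a JSON score, requires a bearer token."""
    headers = [("Content-Type", "application/json")]
    supplied = environ.get("HTTP_AUTHORIZATION", "")
    expected = "Bearer " + API_TOKEN
    if not hmac.compare_digest(supplied.encode(), expected.encode()):
        start_response("401 Unauthorized", headers)
        return [json.dumps({"error": "invalid token"}).encode()]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
    start_response("200 OK", headers)
    return [json.dumps({"score": predict(payload.get("features", []))}).encode()]


def call(wsgi_app, body, token=None):
    """Invoke the app in-process, without opening a network socket."""
    raw = json.dumps(body).encode()
    environ = {}
    setup_testing_defaults(environ)
    environ.update(REQUEST_METHOD="POST", CONTENT_LENGTH=str(len(raw)))
    environ["wsgi.input"] = io.BytesIO(raw)
    if token is not None:
        environ["HTTP_AUTHORIZATION"] = "Bearer " + token
    seen = {}

    def start_response(status, response_headers):
        seen["status"] = status

    response = b"".join(wsgi_app(environ, start_response))
    return seen["status"], json.loads(response)


print(call(app, {"features": [2.0, 1.0]}, token=API_TOKEN))
print(call(app, {"features": [2.0, 1.0]}))
```

The same `app` callable could be served locally with `wsgiref.simple_server.make_server` for manual testing.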
  43. Example 4 - Composable Deployments
      • Deploy composable applications across your data science team
      • Example end-to-end workflow with custom endpoints and API tokens:
        • Stage 1 - Data cleansing
        • Stage 2 - Anomaly detection
        • Stage 3 - Model scoring
        • Stage 4 - Interactive applications and dashboards
        • Stage 5 - Reports and file exports
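The first three stages on slide 43 can be sketched as composable functions, each consuming the previous stage's output. On the slide each stage is a separately deployed endpoint; plain functions stand in for them here, and the z-score rule, the toy scoring summary, and all names are assumptions made for illustration.

```python
from functools import reduce
from statistics import mean, pstdev


def clean(records):
    """Stage 1 - data cleansing: drop missing values, coerce the rest to float."""
    return [float(r) for r in records if r is not None]


def detect_anomalies(values, threshold=2.0):
    """Stage 2 - anomaly detection: flag values more than `threshold` std devs from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [
        {"value": v, "anomaly": sigma > 0 and abs(v - mu) / sigma > threshold}
        for v in values
    ]


def score(rows):
    """Stage 3 - model scoring: a toy summary standing in for a real model."""
    normal = [row["value"] for row in rows if not row["anomaly"]]
    return {
        "rows": len(rows),
        "anomalies": len(rows) - len(normal),
        "normal_mean": mean(normal),
    }


def run_pipeline(data, stages):
    """Feed the output of each stage into the next one."""
    return reduce(lambda result, stage: stage(result), stages, data)


raw = [1, 1, None, 1, 1, 1, 1, 1, 1, 1, 100]
report = run_pipeline(raw, [clean, detect_anomalies, score])
print(report)
```

Because every stage only depends on the shape of its input, a stage can be swapped or redeployed independently, which is the property the slide calls composable.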
  44. Example 5 - ML Models with Visualization
      • Can be built on top of machine learning libraries in Anaconda, including:
        • Tensorflow, H2O, and many more
      • Easily develop interactive applications and dashboards with existing frameworks
      • Handle inputs and outputs to machine learning models
      • Including complex visualization toolkits such as Tensorboard
  45. Getting Started with Anaconda Enterprise for Data Science Deployments and More
  46. Anaconda Platform
      • Anaconda Distribution: The most trusted Python distribution for data science
      • Anaconda Support: Deploy Anaconda with Confidence. World class support for open source production environments.
      • Anaconda Enterprise: Enterprise-ready data science platform for end-to-end workflows, including governance, collaboration, and deployment.
  47. Anaconda Enterprise Features
      • Empower data scientists to easily deploy secure and scalable data science projects to production
      • World class support for open-source production environments
      • Securely govern and version control data science artifacts (projects, packages, installers) from development to production
      • Secure and scalable data science project collaboration
      • Manage Anaconda across a cluster and run data science projects backed by enterprise scalable compute and data sources
      • Bring the power of data science to Business Analysts
  48. Next Steps
      • Get started with the Anaconda Enterprise Innovator program
        • https://go.continuum.io/anaconda-enterprise-innovator/
      • Contact us at:
        • sales@continuum.io
        • https://www.continuum.io/contact-us
  49. Questions?
      • Christine Doig, @ch_doig
      • Kristopher Overholt, @koverholt
      • @ContinuumIO
