Strata - Scaling Jupyter with Jupyter Enterprise Gateway

Scaling Jupyter with
Jupyter Enterprise Gateway
Luciano Resende
Alan Chin
CODAIT - IBM

About me – Alan Chin
Sr. Software Engineer – Build and Infrastructure – CODAIT
• Over 3 years working with Open Source Projects (Apache SystemML, Apache Spark,
Apache Ambari)
• Currently Contributing to the Jupyter Enterprise Gateway Project
akchin@us.ibm.com
https://www.linkedin.com/in/alankchin/
@AlanChin11
https://github.com/akchinSTC
IBM Developer / © 2019 IBM Corporation

About me - Luciano Resende
Open Source AI Platform Architect – IBM – CODAIT
• Senior Technical Staff Member at IBM, contributing to open source for over 10 years
• Currently contributing to: Jupyter Notebook ecosystem, Apache Bahir, Apache
Toree, Apache Spark, among other projects related to AI/ML platforms
lresende@us.ibm.com
https://www.linkedin.com/in/lresende
@lresende1975
https://github.com/lresende

IBM Open Source Contributions
Learn
• Open Source @ IBM Program touches 78,000 IBMers annually
Consume
• Virtually all IBM products contain some open source
• 40,363 pkgs per year
Contribute
• >62K OS Certs per year
• ~10K IBM commits per month
• 1500+ GitHub repos
Connect
• >1000 active IBM Contributors working in key OS projects

IBM Open Source
Participation
IBM generated open source innovation
• 137 IBM Open Code projects w/1500+ GitHub projects
• Projects that have graduated into full open governance:
Jupyter Enterprise Gateway, Node-Red, OpenWhisk,
Apache SystemML, Blockchain Fabric
• https://developer.ibm.com/code/open/code/
Community
• IBM focused on 18 strategic communities
• Drive open governance in “Centers of Gravity”
• IBM Leaders drive key technologies and assure freedom
of action
The IBM OS Way is now open sourced
• Training, Recognition, Tooling
• Organization, Consuming, Contributing

Center for Open Source
Data and AI
Technologies
CODAIT aims to make AI solutions
dramatically easier to create, deploy,
and manage in the enterprise
Relaunch of the Spark Technology
Center (STC) to reflect expanded
mission
CODAIT
codait.org
codait (French)
= coder/coded
https://m.interglot.com/fr/en/codait

Jupyter Notebooks

Jupyter Notebooks
Notebooks are interactive
computational environments,
in which you can combine
code execution, rich text,
mathematics, plots and rich
media.

Jupyter Notebook Platform Architecture
Notebook UI runs in the browser
The Notebook Server serves the ‘Notebooks’
Kernels interpret/execute cell contents
- Are responsible for code execution
- Abstract different languages
- Have a 1:1 relationship with a Notebook
- Run and consume resources as long as the notebook is running

Jupyter Notebook
Interactive Workloads

Analytics Workloads
• Large amount of data
• Shared across organization in Data
Lakes
• Multiple workload types
Data cleansing
Data Warehouse
Machine Learning and Insights

AI / Deep Learning Workloads
Resource-intensive workloads
Require expensive hardware (GPU, TPU)
Long-running training jobs
- Simple MNIST takes over one hour WITHOUT a decent GPU
- Other non-complex deep learning model training can easily take over a day WITH GPUs

Local Development Environment

Analytic and AI
Platforms
Large pool of shared computing
resources
- Enterprise Cloud, Public Cloud or Hybrid
- Shared Data (Data Lakes/Object Storage)
Distributed Consumers
- Notebooks running locally (user's laptop)
or as a service (e.g. JupyterHub)
Different Resource Utilization Patterns
- High number of idle resources

Jupyter Notebook Stack
Limitations
Scalability
- Jupyter Kernels run as local processes
- Resources are limited by what is available on the single node that runs all Kernels and associated Spark drivers
Security
- All kernels run as a single user, sharing the same privileges
- Users can see and control each other's processes using Jupyter administrative utilities
[Chart: Maximum number of simultaneous kernels (4GB heap) by cluster size (32GB nodes): 8 kernels at 4, 8, 12 and 16 nodes]

Jupyter Enterprise Gateway website
https://Jupyter.org/enterprise_gateway/
Jupyter Enterprise Gateway source code at GitHub
https://github.com/jupyter/enterprise_gateway
Jupyter Enterprise Gateway Documentation
http://jupyter-enterprise-gateway.readthedocs.io/en/latest/
Supported Kernels
Supported Platforms
Spectrum Conductor
A lightweight, multi-tenant,
scalable and secure gateway
that enables Jupyter
Notebooks to share resources
across an Apache Spark or
Kubernetes cluster for
Enterprise/Cloud use cases

Jupyter Enterprise Gateway Features
Optimized Resource Allocation
Utilize resources on all cluster nodes by running kernels
as Spark applications in YARN Cluster Mode.
Pluggable architecture to enable support for additional
Resource Managers
Enhanced Security
End-to-End secure communications
- Secure socket communications
- Encrypted HTTP communication using SSL
Multiuser support with user
impersonation
Enhance security and sandboxing by enabling user
impersonation when running kernels (using Kerberos).
Individual HDFS home folder for each notebook user.
Use the same user ID for notebook and batch jobs.
[Chart: Maximum number of simultaneous kernels (4GB heap) by cluster size (32GB nodes): 16 kernels at 4 nodes, 32 at 8 nodes, 48 at 12 nodes, 64 at 16 nodes]
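
The numbers in the two capacity charts can be reproduced with simple arithmetic. The sketch below is an illustration, not the authors' benchmark. It takes the 32GB node size and 4GB kernel heap from the chart labels; the 8-kernel ceiling without the gateway is consistent with 32GB / 4GB on a single node. For the gateway case it uses 4 kernels per node, a figure read off the chart values (16, 32, 48, 64) rather than stated in the slides. The function names are hypothetical.

```python
NODE_MEMORY_GB = 32   # per-node memory, from the chart's x-axis label
KERNEL_HEAP_GB = 4    # per-kernel heap, from the chart's y-axis label


def max_kernels_single_node(node_memory_gb=NODE_MEMORY_GB, heap_gb=KERNEL_HEAP_GB):
    """Upper bound when every kernel runs on the one notebook server node."""
    return node_memory_gb // heap_gb


def max_kernels_distributed(nodes, kernels_per_node=4):
    """Capacity when kernels are spread across all cluster nodes.

    kernels_per_node=4 is inferred from the chart (16 kernels on 4 nodes);
    the slides do not say why it is lower than 32 // 4 = 8.
    """
    return nodes * kernels_per_node


if __name__ == "__main__":
    for nodes in (4, 8, 12, 16):
        print(nodes, max_kernels_single_node(), max_kernels_distributed(nodes))
```

The point of the comparison is that the first figure is independent of cluster size, while the second grows linearly with the number of nodes.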

Jupyter Notebooks
and Kubernetes

Deep Learning Workloads
Resource-intensive workloads
Require expensive hardware (GPU, TPU)
Long-running training jobs
- Simple MNIST takes over one hour WITHOUT a decent GPU
- Other non-complex deep learning model training can easily take over a day WITH GPUs

Jupyter & Kubernetes
Kubernetes Platform
- Containers provide a flexible way to deploy applications and are here to stay
- Containers simplify management of complicated and heterogeneous AI/Deep Learning infrastructure
- Kubernetes enables easy management of containerized applications and resources with the benefit of Elasticity and Quality of Service
Source: https://github.com/Langhalsdino/Kubernetes-GPU-Guide

Enterprise Gateway
& Kubernetes
Before Jupyter Enterprise Gateway …
- Resources required for all kernels need to be allocated during Notebook Server pod creation
- Resources are limited to what is physically available on the host node that runs all kernels and associated Spark drivers
After Jupyter Enterprise Gateway …
- Gateway pod is very lightweight
- Kernels run in their own pods, providing isolation
- Kernel pods are built from community images: Spark-on-K8s, TensorFlow, Keras, etc.
Supported Platforms: FfDL
[Diagram: before Enterprise Gateway vs. after Enterprise Gateway]

Jupyter Enterprise Gateway - Kubernetes
[Diagram: users Bob and Alice, Jupyter Enterprise Gateway, and kernel pods (community image kernel, Spark on Kubernetes kernel)]
Container images defined in kernelspec

Jupyter Enterprise Gateway - Kubernetes
[Diagram: users Bob and Alice on JupyterLab and Jupyter Notebook, Jupyter Enterprise Gateway, and kernel pods (community image kernel, Spark on Kubernetes kernel)]
Container images defined in kernelspec
JupyterHub will provision custom images containing Notebook + NB2KG extension

Jupyter & Kubernetes
• Multi-user Enterprise Gateway pod
• Each kernel launched in its own pod
• Kernel pod namespace is configurable

Configuration
Jupyter Kernels are configured by
kernelspecs
- Each kernel has a corresponding kernelspec
- Stored in one of the Jupyter data path
directories
- $ jupyter kernelspec list
/…/anaconda3/share/jupyter/kernels/python2/kernel.json
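
As a rough illustration of the lookup described above, the sketch below scans a list of data directories for kernels/<name>/kernel.json files using only the standard library. It is a simplified stand-in, not the code behind `jupyter kernelspec list` (the real lookup lives in the jupyter_client package). The function name `find_kernelspecs` is hypothetical, and the caller is assumed to supply the data directories, for example those printed by `jupyter --paths`.

```python
import json
from pathlib import Path


def find_kernelspecs(data_dirs):
    """Return {kernel_name: parsed kernel.json} for every kernelspec found.

    Each data directory is expected to contain kernels/<name>/kernel.json.
    Earlier directories win when the same kernel name appears twice.
    """
    specs = {}
    for data_dir in data_dirs:
        kernels_dir = Path(data_dir) / "kernels"
        if not kernels_dir.is_dir():
            continue  # this data directory has no kernels installed
        for spec_file in sorted(kernels_dir.glob("*/kernel.json")):
            name = spec_file.parent.name.lower()
            if name not in specs:
                with spec_file.open(encoding="utf-8") as fh:
                    specs[name] = json.load(fh)
    return specs
```

Calling `find_kernelspecs([...])` with the share/jupyter directory of an environment would return one entry per installed kernel, keyed by the kernel's directory name.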

Configurations
Process Proxy:
• Abstracts kernel process represented by Jupyter
framework
• Pluggable class definition identified in kernelspec
(kernel.json)
• Manages kernel lifecycle
Kernel Launcher:
• Embeds target kernel
• Listens on gateway communication port
• Conveys interrupt requests (via local signal)
• Could be extended for additional communications
{
  "language": "python",
  "display_name": "Spark - Python (Kubernetes Mode)",
  "process_proxy": {
    "class_name": "enterprise_gateway.services.processproxies.k8s.KubernetesProcessProxy",
    "config": {
      "image_name": "elyra/kubernetes-kernel-py:dev",
      "executor_image_name": "elyra/kubernetes-kernel-py:dev",
      "port_range": "40000..42000"
    }
  },
  "env": {
    "SPARK_HOME": "/opt/spark",
    "SPARK_OPTS": "--master k8s://https://${KUBERNETES_SERVICE_HOST} --deploy-mode cluster --name …",
    …
  },
  "argv": [
    "/usr/local/share/jupyter/kernels/spark_scala_yarn_cluster/bin/run.sh",
    "--RemoteProcessProxy.kernel-id",
    "{kernel_id}",
    "--RemoteProcessProxy.response-address",
    "{response_address}",
    "--RemoteProcessProxy.port-range",
    "{port_range}",
    "--RemoteProcessProxy.spark-context-initialization-mode",
    "lazy"
  ]
}
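
To make the kernelspec above more concrete, here is a minimal sketch of how a launcher could read the process proxy settings and fill the {kernel_id}, {response_address} and {port_range} placeholders in argv. This is not Enterprise Gateway's actual implementation; the helper names, the shortened argv, and the example values are all hypothetical.

```python
EXAMPLE_SPEC = {
    "process_proxy": {
        "class_name": "enterprise_gateway.services.processproxies.k8s.KubernetesProcessProxy",
        "config": {"port_range": "40000..42000"},
    },
    # Shortened argv for illustration; "run.sh" stands in for the full path
    "argv": [
        "run.sh",
        "--RemoteProcessProxy.kernel-id", "{kernel_id}",
        "--RemoteProcessProxy.response-address", "{response_address}",
        "--RemoteProcessProxy.port-range", "{port_range}",
    ],
}


def parse_port_range(value):
    """Split a 'low..high' string such as '40000..42000' into two ints."""
    low, high = value.split("..")
    low, high = int(low), int(high)
    if low > high:
        raise ValueError(f"invalid port range: {value!r}")
    return low, high


def build_launch_argv(kernelspec, **values):
    """Fill the {placeholders} in argv; entries without braces are unchanged."""
    return [arg.format(**values) for arg in kernelspec["argv"]]


if __name__ == "__main__":
    config = EXAMPLE_SPEC["process_proxy"]["config"]
    print(parse_port_range(config["port_range"]))
    print(build_launch_argv(
        EXAMPLE_SPEC,
        kernel_id="example-kernel-id",        # made-up value
        response_address="10.0.0.1:8877",     # made-up value
        port_range=config["port_range"],
    ))
```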

Jupyter Enterprise Gateway Components
Clients
- Jupyter Notebook UI with the NB2KG Extension
- Programmatic API
Jupyter Enterprise Gateway (layered on Jupyter Kernel Gateway and Jupyter Notebook)
- Remote Kernel Manager
- Distributed Process Proxy
- YARN Cluster Process Proxy
- Kubernetes Process Proxy
- Conductor Cluster Process Proxy
- Docker Process Proxy
Supported Runtime Platforms
- Spectrum Conductor
- FfDL
With Notebook 6.0, the NB2KG capabilities have been integrated into the Notebook server.
For programmatic access, we have an experimental Enterprise Gateway client that makes it simple to request a kernel and submit code.
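
For a sense of what requesting a kernel programmatically involves, the sketch below prepares (without sending) the HTTP request that starts a kernel. It is not the experimental client mentioned above. It assumes the gateway exposes the Jupyter kernel REST endpoint at /api/kernels and reads a KERNEL_USERNAME value from the request's env; the host, port, kernel name and user name are made-up examples. Submitting code would then go over the kernel's WebSocket channel, which is not shown.

```python
import json
import urllib.request


def build_start_kernel_request(gateway_url, kernel_name, username):
    """Prepare (but do not send) a POST that asks the gateway for a kernel."""
    body = {
        "name": kernel_name,
        # Assumption: the gateway reads KERNEL_USERNAME from the request env
        "env": {"KERNEL_USERNAME": username},
    }
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + "/api/kernels",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical gateway address and kernel name
    req = build_start_kernel_request("http://localhost:8888", "python_kubernetes", "alice")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` against a running gateway would send it; the response is expected to describe the new kernel, including its id.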

Summary

Interactive Workloads
across Kubernetes Cluster
• Enable support for remote kernels in order to scale Notebooks across the entire cluster
• Multitenant, with support for user impersonation leveraging Kerberos
• Base container image becomes a choice (e.g. Python with TensorFlow)

Other resources
Jupyter Enterprise Gateway website
https://Jupyter.org/enterprise_gateway/
Jupyter Enterprise Gateway source code at GitHub
https://github.com/jupyter/enterprise_gateway
Jupyter Enterprise Gateway Documentation
http://jupyter-enterprise-gateway.readthedocs.io/en/latest/
Jupyter Enterprise Gateway Gitter
https://gitter.im/jupyter/enterprise_gateway
Jupyter Blog
https://blog.jupyter.org/
Stable Release - EG 1.2.0
(Analytics Workload with Spark running
YARN cluster mode support)
pip install jupyter_enterprise_gateway
Beta Release - EG 2.0.0 RC1
Introduce support for AI Workloads on
Kubernetes
pip install --pre jupyter_enterprise_gateway
Star us and fork us on GitHub

Thank you!
@lresende1975
@AlanChin11
