How-to create a secured multi
tenancy for Clustered ML with
JupyterHub
Non-root + JupyterHub + Kerberos +
IPython Cluster as a service
Introduction
With this presentation you should be able to create a Kerberos-secured architecture for an interactive data analysis and machine learning framework, using Jupyter/JupyterHub powered by IPython Clusters, which distributes the processing across local and/or remote nodes.
Architecture
This architecture enables the following:
● Transparent data-science development
● User authentication
● Authentication via Kerberos + SSH
● Upgrades on the cluster won’t affect existing developments.
● Controlled access to data and resources via Kerberos tickets.
● Several coding APIs (Scala, R, Python, PySpark, etc.).
● Parallel processing
● JupyterHub running as a service under a non-root user
Architecture
Pre-Assumptions
1. Jupyter machine hostname: cm1.localdomain
2. Controller node hostname: cm1.localdomain; engine node hostname: cm2.localdomain
3. Conda Python version: 3.8.5
4. Jupyter machine authentication pre-installed: Kerberos (a quick check follows this list)
a. Kerberos realm: DOMAIN.COM
5. Kerberos authentication on JupyterHub itself: not yet installed (configured later in this guide)
6. A user with root or sudo permissions
7. MIT Kerberos installed on your Windows machine
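Before going further, it is worth confirming that the Kerberos client on cm1.localdomain is already configured for the realm. A minimal optional check (standard MIT Kerberos tools; kadmin.local is only present if cm1 also hosts the KDC, as the later kadmin.local steps assume):
which kinit klist kadmin.local;
grep -i "DOMAIN.COM" /etc/krb5.conf;   # the realm from assumption 4a should appear here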
Miniconda
Add Anaconda User/Dir
adduser anaconda;
passwd anaconda;
mkdir /opt/anaconda;
Download and installation
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -P /tmp;
chmod +x /tmp/Miniconda3-latest-Linux-x86_64.sh;
/tmp/Miniconda3-latest-Linux-x86_64.sh -b -u -p /opt/anaconda;
Note 1: Replace the highlighted values with your own.
Note 2: JupyterHub requires Python 3.x, so Miniconda3 (Python 3) is installed.
Add Permissions MiniConda
chown -R anaconda:anaconda /opt/anaconda;
chmod -R go-w /opt/anaconda && chmod -R go+rX /opt/anaconda;
mkdir -p /apps/anaconda/pkgs;
chown -R anaconda:anaconda /apps/anaconda/pkgs && chmod -R oug+rwx /apps;
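As a quick sanity check (not part of the original slides), confirm that the anaconda user can run the freshly installed conda from /opt/anaconda:
sudo -u anaconda /opt/anaconda/bin/conda --version;
sudo -u anaconda /opt/anaconda/bin/conda info;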
Anaconda
Set Conda Bash Configurations
nano .bashrc;
export CONDA_PKGS_DIRS="/apps/anaconda/pkgs","/opt/anaconda/pkgs","/home/$USER/.conda/pkgs"
export CONDA_ENVS_DIRS="/apps/anaconda/$USER/envs"
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/opt/anaconda/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/opt/anaconda/etc/profile.d/conda.sh" ]; then
        . "/opt/anaconda/etc/profile.d/conda.sh"
    else
        export PATH="/opt/anaconda/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
conda config --set auto_update_conda False && conda config --add channels conda-forge;
conda config --set pip_interop_enabled True;
Note: Replace the highlighted values with your own.
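To confirm the settings above were picked up (an optional check; assumes the .bashrc changes have been re-sourced in the current shell):
source ~/.bashrc;
conda config --show channels;
conda info;   # package cache and envs directories should point to the /apps/anaconda and /opt/anaconda paths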
Jupyter or JupyterHub?
JupyterHub is a multi-user notebook server that:
● Manages authentication.
● Spawns single-user notebook servers on demand.
● Gives each user a complete notebook server.
How to choose?
JupyterHub
JupyterHub needs to be executed with root privileges, or at least some of them (for example, to read the PAM passwords). Therefore we need to configure a special user (with no password) that will be used by the sudospawner.
For this example we will use user: jupyter | group: jupyterhub to run the JupyterHub server as a service. Any new user that should access Jupyter and spawn notebooks must be added to the jupyterhub group.
Create User/Group to operate as Service
sudo useradd jupyter && sudo groupadd jupyterhub && sudo usermod -a -G jupyterhub jupyter;
Add jupyter to root group & Give Read Permissions (PAM)
sudo usermod -a -G root jupyter; sudo chmod g+r /etc/shadow;
Log as Jupyter user
su - jupyter;
Note 1: only the highlighted values need to be changed for your environment.
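A quick way to verify the PAM-related permissions took effect (an optional check, run as the admin/sudo-capable user; paths and names come from the steps above):
id jupyter;   # should list the root and jupyterhub groups
sudo -u jupyter head -c 1 /etc/shadow > /dev/null && echo "shadow readable by jupyter";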
JupyterHub
Set Conda Bash Configurations
Use the same conda bash configuration shown earlier (slide 7, “Set Conda Bash Configurations”).
Create Environment for JupyterHub
conda create -n jupyterhub_env;
Activate Environment for JupyterHub
conda activate jupyterhub_env;
Install JupyterHub Packages
conda install jupyterhub jupyterlab notebook configurable-http-proxy;
Install sudospawner Package
conda install -c conda-forge sudospawner;
Check sudospawner location
which sudospawner;
Note 1: only the highlighted values need to be changed for your environment.
Create JupyterHub Directories
sudo mkdir /etc/jupyterhub;
sudo chown jupyter:jupyterhub /etc/jupyterhub;
Generate JupyterHub Config file
cd /etc/jupyterhub && jupyterhub --generate-config;
JupyterHub
Create/Edit sudoers config
sudo nano /etc/sudoers.d/jupytersudoers;
Runas_Alias JUPYTER_USERS = jupyter
Cmnd_Alias JUPYTER_CMD = /apps/anaconda/jupyter/envs/jupyterhub_env/bin/sudospawner
%jupyterhub ALL=(jupyter) /usr/bin/sudo
jupyter ALL=(%jupyterhub) NOPASSWD:JUPYTER_CMD
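A syntax error in a sudoers drop-in can break sudo entirely, so it is prudent (not shown on the slide) to validate the file and inspect the resulting rules:
sudo visudo -cf /etc/sudoers.d/jupytersudoers;   # should report "parsed OK"
sudo -l -U jupyter;                              # should show the NOPASSWD sudospawner rule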
Start JupyterHub Server With Config File
jupyterhub -f /etc/jupyterhub/jupyterhub_config.py;
Note: only the highlighted values need to be changed, e.g. your IP address.
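With the server running in the foreground, a simple check from a second shell confirms the proxy is answering (the IP below is the placeholder used throughout these slides):
curl -sI http://10.111.22.333:8000/hub/login | head -n 1;   # expect an HTTP 200 response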
Edit JupyterHub Config File
nano /etc/jupyterhub/jupyterhub_config.py;
import os
import pwd
import subprocess
def create_dir_hook(spawner):
    if not os.path.exists(os.path.join('/home/', spawner.user.name)):
        subprocess.call(["sudo", "/sbin/mkhomedir_helper", spawner.user.name])

c.Spawner.pre_spawn_hook = create_dir_hook
c.JupyterHub.bind_url = 'http://10.111.22.333:8000'
c.JupyterHub.hub_bind_url = 'http://10.111.22.333:8081'
c.JupyterHub.hub_ip = '10.111.22.333'
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
c.SudoSpawner.sudospawner_path = '/apps/anaconda/jupyter/envs/jupyterhub_env/bin/sudospawner'
c.Authenticator.admin_users = {'jupyter'}
JupyterHub
Create systemd JupyterHub Directory
sudo mkdir -p /home/jupyter/.config/systemd;
Create systemd JupyterHub service Configuration
sudo nano /home/jupyter/.config/systemd/jupyterhub.service;
[Unit]
Description=Jupyterhub Server
After=syslog.target network-online.target
[Service]
Type=simple
User=jupyter
ExecStart=/etc/jupyterhub/runJupyterhub.sh
WorkingDirectory=/etc/jupyterhub
Restart=on-failure
RestartSec=1min
TimeoutSec=5min
[Install]
WantedBy=multi-user.target
Note: only the highlighted values need to be changed for your environment.
Create JupyterHub Script for Systemd
nano /etc/jupyterhub/runJupyterhub.sh;
#!/bin/bash
export CONDA_PKGS_DIRS="/apps/anaconda/pkgs","/opt/anaconda/pkgs","/home/$USER/.conda/pkgs"
export CONDA_ENVS_DIRS="/apps/anaconda/$USER/envs"
__conda_setup="$('/opt/anaconda/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/opt/anaconda/etc/profile.d/conda.sh" ]; then
        . "/opt/anaconda/etc/profile.d/conda.sh"
    else
        export PATH="/opt/anaconda/bin:$PATH"
    fi
fi
unset __conda_setup
conda activate /apps/anaconda/jupyter/envs/jupyterhub_env
/apps/anaconda/jupyter/envs/jupyterhub_env/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py 2>&1 | tee /var/log/jupyter/jupyterhub.log
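The script tees its output to /var/log/jupyter/jupyterhub.log, so that directory must exist and be writable by the jupyter user, and the script itself must be executable; these steps are implied rather than shown on the slide:
sudo mkdir -p /var/log/jupyter;
sudo chown jupyter:jupyterhub /var/log/jupyter;
chmod +x /etc/jupyterhub/runJupyterhub.sh;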
JupyterHub
Create systemd JupyterHub service symbolic link
sudo ln -s /home/jupyter/.config/systemd/jupyterhub.service /etc/systemd/system/jupyterhub.service;
Enable/Start systemd JupyterHub service
sudo systemctl enable jupyterhub.service;
sudo systemctl start jupyterhub && systemctl status jupyterhub;
Note: only the highlighted values need to be changed for your environment.
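If the service does not come up cleanly, the usual places to look are the systemd journal and the log file defined in runJupyterhub.sh:
sudo systemctl daemon-reload;   # re-read units if the symlink was added after boot
journalctl -u jupyterhub.service -n 50 --no-pager;
tail -f /var/log/jupyter/jupyterhub.log;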
IPython Clusters
This functionality enables the current architecture to distribute your Python processing across local and/or remote CPUs, and therefore to take advantage of parallel processing.
Install ipyparallel
conda install ipyparallel;
Note: This package must be installed on the controller machine and on all remote engine nodes!
Apply to All Users
jupyter nbextension install --sys-prefix --py ipyparallel;
jupyter nbextension enable --sys-prefix --py ipyparallel;
jupyter serverextension enable --sys-prefix --py ipyparallel;
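To confirm the extensions are registered for all users (an optional check using the standard Jupyter extension listing commands):
jupyter nbextension list;
jupyter serverextension list;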
IPython Clusters
Create ssh profile on user
ipython profile create --parallel --profile=ssh;
Note: this is done in the scope of the user that will run/spawn the notebook, e.g. tpsimoes.
Configure ssh profile on user
nano /home/tpsimoes/.ipython/profile_ssh/ipcluster_config.py;
c.IPClusterStart.controller_launcher_class = 'Local'
c.IPClusterEngines.engine_launcher_class = 'SSH'
c.SSHEngineSetLauncher.engines = { 'cm1.localdomain' : 2, 'cm2.localdomain' : 5 }
nano /home/tpsimoes/.ipython/profile_ssh/ipcontroller_config.py;
c.IPControllerApp.location = 'cm1.localdomain'
c.HubFactory.client_ip = '10.111.22.333'
c.HubFactory.engine_ip = '10.111.22.333'
c.HubFactory.ip = '*'
Note: only the highlighted values need to be changed for your environment.
So that the IPython Cluster controller (SSH profile) can communicate with all the engines (local and remote), we need to configure passwordless SSH on the local machine and also on the remote nodes.
KeyLess Configuration
ssh-keygen;
Copy the SSH public key (id_rsa.pub) to the user account on your target hosts.
ssh-copy-id -i ~/.ssh/id_rsa.pub -p 22 tpsimoes@cm2.localdomain;
Add the SSH public key to the local authorized_keys file as well (the controller also reaches local engines via SSH).
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys && chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys;
SSH once to each host (adds the host keys to known_hosts)
ssh tpsimoes@localhost;
ssh tpsimoes@cm1.localdomain;
ssh tpsimoes@cm2.localdomain;
Try connecting User via SSH
ssh -p '22' 'tpsimoes@cm2.localdomain';
Note: only the highlighted values need to be changed for your environment.
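Before relying on the JupyterHub UI, the SSH profile can be smoke-tested from the command line as the notebook user, e.g. tpsimoes (a suggested check; it assumes ipcluster and the ipyparallel Python client are available in that user’s conda environment):
ipcluster start --profile=ssh --daemonize;   # launches the controller locally and the engines over SSH
sleep 30;                                    # give the remote engines time to register
python -c "import ipyparallel as ipp; print('engines:', ipp.Client(profile='ssh').ids)";
ipcluster stop --profile=ssh;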
IPython Clusters
When starting a cluster via the JupyterHub UI, you should see the communication between the machines in your logs…
JupyterHub Logs
[I 2021-02-22 14:28:43.979 SingleUserNotebookApp launcher:591] ensuring remote cm1.localdomain:.ipython/profile_ssh/security/ exists
Connection to cm1.localdomain closed.
[I 2021-02-22 14:28:44.776 SingleUserNotebookApp launcher:595] sending /home/tpsimoes/.ipython/profile_ssh/security/ipcontroller-client.json to
cm1.localdomain:.ipython/profile_ssh/security/ipcontroller-client.json
[I 2021-02-22 14:28:45.573 SingleUserNotebookApp launcher:591] ensuring remote cm1.localdomain:.ipython/profile_ssh/security/ exists
Connection to cm1.localdomain closed.
[I 2021-02-22 14:28:46.405 SingleUserNotebookApp launcher:595] sending /home/tpsimoes/.ipython/profile_ssh/security/ipcontroller-engine.json to
cm1.localdomain:.ipython/profile_ssh/security/ipcontroller-engine.json
[I 2021-02-22 14:28:47.308 SingleUserNotebookApp launcher:591] ensuring remote cm2.localdomain:.ipython/profile_ssh/security/ exists
Connection to cm2.localdomain closed.
[I 2021-02-22 14:28:48.087 SingleUserNotebookApp launcher:595] sending /home/tpsimoes/.ipython/profile_ssh/security/ipcontroller-client.json to
cm2.localdomain:.ipython/profile_ssh/security/ipcontroller-client.json
[I 2021-02-22 14:28:48.875 SingleUserNotebookApp launcher:591] ensuring remote cm2.localdomain:.ipython/profile_ssh/security/ exists
Connection to cm2.localdomain closed.
[I 2021-02-22 14:28:49.652 SingleUserNotebookApp launcher:595] sending /home/tpsimoes/.ipython/profile_ssh/security/ipcontroller-engine.json to
cm2.localdomain:.ipython/profile_ssh/security/ipcontroller-engine.json
Kerberos on JupyterHub
Install gcc Lib
sudo yum install -y gcc;
Create HTTP principal/keytab
sudo kadmin.local <<eoj
addprinc -randkey HTTP/cm1.localdomain@DOMAIN.COM
xst -norandkey -k HTTP.keytab HTTP/cm1.localdomain@DOMAIN.COM
eoj
Change Ownership and Permissions on Keytab
sudo mv HTTP.keytab /etc/jupyterhub/HTTP.keytab;
sudo chmod 440 /etc/jupyterhub/HTTP.keytab;
sudo chown jupyter:jupyterhub /etc/jupyterhub/HTTP.keytab;
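To confirm the keytab holds the expected service principal and is readable by the service account (klist -kt lists the keytab entries):
sudo -u jupyter klist -kt /etc/jupyterhub/HTTP.keytab;   # should list HTTP/cm1.localdomain@DOMAIN.COM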
Install Kerberos Authenticator Package
pip install jupyterhub-kerberosauthenticator;
Note: only the highlighted values need to be changed for your environment.
Edit Final JupyterHub Configurations
nano /etc/jupyterhub/jupyterhub_config.py;
c.PAMAuthenticator.open_sessions = False
import os
import pwd
import subprocess
def create_dir_hook(spawner):
    if not os.path.exists(os.path.join('/home/', spawner.user.name)):
        subprocess.call(["sudo", "/sbin/mkhomedir_helper", spawner.user.name])

c.Spawner.pre_spawn_hook = create_dir_hook
c.JupyterHub.bind_url = 'http://10.111.22.333:8000'
c.JupyterHub.hub_bind_url = 'http://10.111.22.333:8081'
c.JupyterHub.hub_ip = '10.111.22.333'
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
c.SudoSpawner.sudospawner_path = '/apps/anaconda/jupyter/envs/jupyterhub_env/bin/sudospawner'
c.Authenticator.admin_users = {'jupyter'}
c.JupyterHub.authenticator_class = 'kerberosauthenticator.KerberosLocalAuthenticator'
Kerberos on JupyterHub
One thing to pay attention to: all user principals should be added as headless principals!
Create User Principal
sudo kadmin.local <<eoj
addprinc -pw password tpsimoes@DOMAIN.COM
modprinc -maxrenewlife 7d +allow_renewable tpsimoes@DOMAIN.COM
eoj
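A quick way to confirm the new principal works before trying it on the JupyterHub login page:
kinit tpsimoes@DOMAIN.COM;   # use the password set above
klist;                       # should show a renewable krbtgt/DOMAIN.COM@DOMAIN.COM ticket
kdestroy;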
JupyterHub Logs
[I 2021-02-19 18:01:18.993 JupyterHub app:2240] Running JupyterHub version 1.1.0
[I 2021-02-19 18:01:18.994 JupyterHub app:2270] Using Authenticator: kerberosauthenticator.auth.KerberosLocalAuthenticator-0.2.0
[I 2021-02-19 18:01:18.994 JupyterHub app:2270] Using Spawner: jupyterhub.spawner.LocalProcessSpawner-1.1.0
[I 2021-02-19 18:01:18.994 JupyterHub app:2270] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-1.1.0
[I 2021-02-19 18:01:18.999 JupyterHub app:1349] Loading cookie_secret from /root/jupyterhub_cookie_secret
[I 2021-02-19 18:01:19.033 JupyterHub proxy:461] Generating new CONFIGPROXY_AUTH_TOKEN
[...]
[I 2021-02-19 18:02:36.755 JupyterHub base:707] User logged in: tpsimoes
[I 2021-02-19 18:02:36.757 JupyterHub log:174] 302 GET /hub/kerberos_login -> /hub/spawn (@10.184.16.24) 11.75ms
[I 2021-02-19 18:02:36.837 JupyterHub spawner:1417] Spawning jupyterhub-singleuser --port=42504
[I 2021-02-19 18:02:39.082 SingleUserNotebookApp singleuser:561] Starting jupyterhub-singleuser server version 1.1.0
Thanks
Big Data Engineer
Tiago Simões