In this session you will see how to take your machine learning model from development to production, step by step: 1) developing an ML model in a Jupyter notebook running directly on top of Kubernetes/OpenShift; 2) publishing that model as a service to be shared with your team, or even the world; and 3) monitoring the resulting RESTful service with Grafana.
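As a concrete taste of step 2, here is a minimal sketch of wrapping a trained model in a small RESTful service. It assumes a pickled scikit-learn-style model saved as model.pkl and Flask for the HTTP layer; neither the file name nor the serving stack is specified in this abstract, so treat both as placeholders.

# serve_model.py: sketch of publishing a trained model as a RESTful service.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the trained model artifact (hypothetical file name).
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body like {"instances": [[1.0, 2.0], [3.0, 4.0]]}.
    payload = request.get_json(force=True)
    predictions = model.predict(payload["instances"]).tolist()
    return jsonify({"predictions": predictions})

if __name__ == "__main__":
    # Containerised and deployed on Kubernetes/OpenShift, this endpoint becomes
    # the service shared with your team or the world.
    app.run(host="0.0.0.0", port=8080)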
9-17. [Diagram, repeated across these slides with different stages highlighted: the machine learning workflow runs from codifying the problem and metrics, through data collection and cleaning, feature engineering, model training and tuning, and model validation, to model deployment and then monitoring and validation, which feeds back into codifying the problem and metrics.]
Robust, repeatable pipelines
Model monitoring and alerting
Scale up
Scale out
Self service environments
Reproducible environments
Multi-tenant environments
Access to specialised hardware
22. [Diagram: a container image assembled in layers, from a base image, through configuration and installation recipes, to the user application code, with each layer identified by its own content hash. The resulting image is labelled "model in production on 16 July 2019".]
28. [Recap: the machine learning workflow diagram again, set against the production requirements from the earlier slides: robust, repeatable pipelines; model monitoring and alerting; scale up and scale out; self-service environments; reproducible environments; multi-tenant environments; access to specialised hardware.]
29-36. [Screenshots stepping through a Jupyter notebook titled "Exploring my data set" (Python 3 kernel), running its cells one at a time; the execution counters advance from empty, through In [*] while a cell runs, to In [1] through In [4] once everything has executed. The notebook contains:]

In [1]: import pandas as pd
        data = pd.read_parquet("data.parquet")
        data.size
Out[1]: 120000

In [2]: from plot_tool import plot
        plot_df = pd.DataFrame(data, columns=["x", "y"])
        plot(plot_df)

In [3]: new_data = pd.read_parquet("new_data.parquet")
        data = data.append(pd.DataFrame(new_data))

So far we have loaded in the data and plotted it. Then we appended our new data. Now we save it to file.

In [4]: data.to_parquet("data2.parquet")
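Running those cells out of order by hand is exactly what makes a notebook hard to reproduce. As a hedged sketch of the "robust, repeatable pipelines" idea, the same exploration can be captured as a linear script; this assumes the data files and the plot_tool helper from the slides are available, plus a recent pandas (2.x), where DataFrame.append has been removed and pd.concat is used instead. It is not code shown in the session.

# explore_data.py: the notebook's steps with a fixed execution order.
import pandas as pd

from plot_tool import plot  # plotting helper used on the slides

# Load and inspect the data.
data = pd.read_parquet("data.parquet")
print(data.size)

# Plot the x and y columns.
plot_df = pd.DataFrame(data, columns=["x", "y"])
plot(plot_df)

# Append the new data; pd.concat replaces the removed DataFrame.append.
new_data = pd.read_parquet("new_data.parquet")
data = pd.concat([data, pd.DataFrame(new_data)], ignore_index=True)

# Save the combined data to file.
data.to_parquet("data2.parquet")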
37. Identity and RBAC
Project Isolation
Sharing of resources
Resource Isolation
Resource Quotas (see the quota sketch after this list)
Priority
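Resource quotas are set per project (namespace), so a tenant can check what they are allowed to consume. As an illustration that is not part of the session, this sketch lists the ResourceQuotas bounding a hypothetical project using the official kubernetes Python client and an existing kubeconfig.

# quotas.py: list the ResourceQuotas that bound a project/namespace.
from kubernetes import client, config

config.load_kube_config()  # inside a pod, use config.load_incluster_config()
core = client.CoreV1Api()

# "my-data-science-project" is a placeholder namespace name.
for quota in core.list_namespaced_resource_quota("my-data-science-project").items:
    print(quota.metadata.name)
    for resource, limit in (quota.status.hard or {}).items():
        used = (quota.status.used or {}).get(resource, "0")
        print(f"  {resource}: {used} used of {limit}")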
38. [Closing recap: the machine learning workflow diagram once more, with the requirements addressed during the session: robust, repeatable pipelines; model monitoring and alerting; self-service environments; reproducible environments; multi-tenant environments; access to specialised hardware. A sketch of wiring the prediction service up for that monitoring follows below.]
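To make "model monitoring and alerting" concrete, here is a hedged sketch of how the prediction service sketched earlier could expose metrics for Prometheus, which a Grafana dashboard then charts. It assumes the prometheus_client library and the hypothetical Flask endpoint from before; the monitoring wiring actually used in the session is not visible in this transcript.

# serve_model_with_metrics.py: instrumenting the prediction service for Grafana.
import time

from flask import Flask, Response, jsonify, request
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest

app = Flask(__name__)

PREDICTIONS = Counter("model_predictions_total", "Prediction requests served")
LATENCY = Histogram("model_prediction_seconds", "Prediction latency in seconds")

@app.route("/predict", methods=["POST"])
def predict():
    start = time.time()
    payload = request.get_json(force=True)
    # model.predict(...) would go here; a placeholder keeps the sketch self-contained.
    result = {"predictions": [0.0] * len(payload["instances"])}
    PREDICTIONS.inc()
    LATENCY.observe(time.time() - start)
    return jsonify(result)

@app.route("/metrics")
def metrics():
    # Prometheus scrapes this endpoint; Grafana dashboards query Prometheus.
    return Response(generate_latest(), mimetype=CONTENT_TYPE_LATEST)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)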