From Training to Explainability via GitOps
1. From Training to Explainability via GitOps
Kubeflow Contributor Summit
October 2019
2. Outline
- Background: What Customers want from Kubeflow
  - Time to value
  - Governance
- How best to get to live predictions?
- GitOps - why and how
- Pipeline to serving walkthrough with
  - Oversight
  - Observability
  - Explainability
3. What Customers want from an ML Platform
Empowerment/Time to value
● Self-service for data science
● DS & Ops collaboration
● Sandboxing
● Repeatable approaches
Governance
● Visibility and oversight of running models
● Detailed monitoring
● Audit trails
● Access control
● Repeatable approaches
● Explainability
Kubeflow ticks these boxes!
5. Kubeflow for Governance
- Metadata/Artifact Management
- Track what was produced, when and how
- Multi User Isolation
- Control who can do what
6. Path to Live Serving
Those features are aimed at exploration and training.
There are multiple paths to serving (live predictions) with Kubeflow.
How best to get from training to serving?
How do we get to serving with empowerment and governance?
7. GitOps for Live Serving
● Cluster state represented declaratively
● ArgoCD/Flux/Jenkins-X
● Audit trails and reverts
● Git permissions
● Favourite with Ops
Ok to push to cluster for sandboxing.
GitOps great option for prod… but how best to do it?
9. The scenario
● Classify income (as high or low) based on US Census features incl. age,
gender, race, marital status
● Train a scikit-learn classifier
● Deploy from a Kubeflow pipeline via GitOps
● Serve requests with Seldon
● Deploy alibi explainer and explain predictions
10. Build Model
- Model is income classifier
- Build alibi explainer together with model
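import alibi
import dill
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline

# Train a random forest model. The preprocessor and classifier share one pipeline,
# so the served model accepts raw feature rows.
# preprocessor, X_train, Y_train, feature_names and category_map come from
# earlier data preparation (not shown on the slide).
np.random.seed(0)
clf = RandomForestClassifier(n_estimators=50)
#clf.fit(preprocessor.transform(X_train), Y_train)
pipeline = Pipeline([('preprocessor', preprocessor),
                     ('clf', clf)])
pipeline.fit(X_train, Y_train)
print(X_train.shape)
print(pipeline.predict(X_train[0:1]))

print("Creating an explainer")
predict_fn = lambda x: pipeline.predict_proba(x)
# Sanity-check the prediction function on a real row and on an all-zeros row.
predict_fn(X_train[0:1])
predict_fn(np.zeros([1, len(feature_names)]))
explainer = alibi.explainers.AnchorTabular(predict_fn=predict_fn,
                                           feature_names=feature_names,
                                           categorical_names=category_map)
explainer.fit(X_train)
# Clear the explainer's predict_fn: it's a lambda and will be reset when the explainer is loaded.
explainer.predict_fn = None
with open("explainer.dill", 'wb') as f:
    dill.dump(explainer, f)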
11. Seldon GitOps Serving
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: sklearn
spec:
  name: iris
  predictors:
  - graph:
      children: []
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/sklearn/iris
      name: classifier
    name: default
    replicas: 1
Model in storage bucket
Manifest in Git
KFServing too
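One way a pipeline step could produce the manifest that goes into Git is sketched below. This is a minimal illustration, not the approach used in the talk: the helper names `seldon_manifest` and `write_manifest`, the `deployments/` directory and the use of JSON instead of YAML are all assumptions. Kubernetes tooling generally accepts JSON manifests as well as YAML. The commit and push that a GitOps tool would react to are left out.

```python
import json
import tempfile
from pathlib import Path


def seldon_manifest(name: str, model_uri: str, replicas: int = 1) -> dict:
    """Build a SeldonDeployment with the same fields as the manifest above."""
    return {
        "apiVersion": "machinelearning.seldon.io/v1alpha2",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "name": name,
            "predictors": [
                {
                    "graph": {
                        "children": [],
                        "implementation": "SKLEARN_SERVER",
                        "modelUri": model_uri,
                        "name": "classifier",
                    },
                    "name": "default",
                    "replicas": replicas,
                }
            ],
        },
    }


def write_manifest(repo_dir: Path, manifest: dict) -> Path:
    """Write the manifest into a (hypothetical) deployments/ folder of a Git checkout."""
    path = repo_dir / "deployments" / f"{manifest['metadata']['name']}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys keep the file stable, so Git diffs show only real changes.
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True) + "\n")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        manifest = seldon_manifest("income", "gs://seldon-models/sklearn/income/model")
        print(write_manifest(Path(tmp), manifest).read_text())
```

Writing the file deterministically matters for the audit-trail argument: a new model version then shows up in Git as a one-line change to `modelUri`.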
19. Sidenote: Access Control
Can’t have metrics without requests
Access from curl or Seldon UI predict/load-test
If you don’t have an existing auth preference we like...
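As a rough sketch of what a prediction request from a script might look like, the snippet below builds (but does not send) an HTTP request. The URL path and the `ndarray` payload follow the Seldon Core REST convention as commonly documented for this API version, but the exact path depends on the ingress setup, so treat it as an assumption. The helper name `build_predict_request`, the host, the feature row and the optional bearer token are illustrative only; the token is a generic header and not the auth mechanism the slide refers to.

```python
import json
from typing import List, Optional
from urllib.request import Request


def build_predict_request(
    host: str,
    namespace: str,
    deployment: str,
    rows: List[List[float]],
    token: Optional[str] = None,
) -> Request:
    """Build a POST request for a SeldonDeployment's prediction endpoint."""
    # Assumed path layout: /seldon/<namespace>/<deployment>/api/v0.1/predictions
    url = f"http://{host}/seldon/{namespace}/{deployment}/api/v0.1/predictions"
    body = json.dumps({"data": {"ndarray": rows}}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Hypothetical: only needed if the ingress enforces token auth.
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Illustrative encoded feature row; real values depend on the preprocessor.
    req = build_predict_request(
        "localhost:8080", "default", "income",
        [[39, 7, 1, 1, 1, 1, 4, 1, 2174, 0, 40, 9]],
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the result to `urllib.request.urlopen` would send it; each such request is what feeds the metrics mentioned above.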
22. Explainer Deployment
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: income
spec:
  name: income
  predictors:
  - graph:
      children: []
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/sklearn/income/model
      name: classifier
    explainer:
      type: anchor_tabular
      modelUri: gs://seldon-models/sklearn/income/explainer
    name: default
    replicas: 1
Declarative yaml
Wizards for time to value & sandboxing
23. Alibi Explainers
- Includes techniques for black-box models
- We’ll use anchors for tabular data
- Anchors are sufficient conditions to ensure a certain prediction
- As long as the anchor holds, the prediction should remain the same
regardless of the values of the other features
- Anchors are chosen to maximise the range for which the prediction holds
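The two properties in the bullets above can be made concrete with a toy calculation: precision is how often the prediction holds when the anchor holds, and coverage is how much of the data the anchor applies to. The sketch below is not how alibi works internally (alibi estimates these quantities by sampling perturbed instances); it simply counts over a small hand-made dataset. `anchor_stats`, `toy_predict` and `ROWS` are hypothetical names and data introduced for illustration.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]


def anchor_stats(
    rows: List[Row],
    anchor: Dict[str, str],
    predict: Callable[[Row], str],
    target: str,
) -> Tuple[float, float]:
    """Return (precision, coverage) of `anchor` for the prediction `target`."""
    # Rows where every feature named in the anchor has the anchored value.
    covered = [r for r in rows if all(r[k] == v for k, v in anchor.items())]
    if not covered:
        return 0.0, 0.0
    coverage = len(covered) / len(rows)
    precision = sum(predict(r) == target for r in covered) / len(covered)
    return precision, coverage


def toy_predict(row: Row) -> str:
    """Hypothetical stand-in for the income classifier."""
    return "high" if row["education"] == "doctorate" or row["hours"] == "60+" else "low"


ROWS: List[Row] = [
    {"education": "doctorate", "hours": "40", "marital": "married"},
    {"education": "doctorate", "hours": "60+", "marital": "single"},
    {"education": "bachelors", "hours": "60+", "marital": "married"},
    {"education": "bachelors", "hours": "40", "marital": "married"},
    {"education": "highschool", "hours": "40", "marital": "single"},
    {"education": "highschool", "hours": "20", "marital": "married"},
]

if __name__ == "__main__":
    for anchor in ({"education": "doctorate"}, {"marital": "married"}):
        precision, coverage = anchor_stats(ROWS, anchor, toy_predict, "high")
        print(anchor, f"precision={precision:.2f}", f"coverage={coverage:.2f}")
```

On this data, `education = doctorate` is a good anchor for "high": the prediction holds for every covered row, whatever the other features are. `marital = married` covers more rows but the prediction holds for only half of them, so it is not a sufficient condition.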
26. Wrap-up
● What Seldon Customers want
○ Time to value
○ Governance
● GitOps helps with both
● Pipeline to serving walkthrough with
○ Oversight
○ Observability
○ Explainability
27. The Future
Very excited about:
● Metadata integrations
● Permissions
● KFServing and MLGraph