Being woken up at 3 am by the pager is never fun, but seeing an incident resolve before you’ve even left the bed is maddening. Sleepily, the next day, you tune the alert for a better night’s sleep, yet more untuned alerts sing to you in your sleep. After a few rounds of alert-tuning whack-a-mole you wonder: could I predict whether an incident will resolve itself?
This is the story of how a weary engineer used a Cloud ML model with Cloud Functions to reduce pager noise. Recounting some of the challenges faced, we’ll explore training a model with a limited data set and continual training in a serverless environment. We’ll also explore the implications of using a bot as a first responder to a pager.
3. October 1st 2019
@mlfowler_
@Claranet
I Like to Think I Know Data
Source: https://peakcare.wordpress.com/2011/10/05/heads-in-the-sand/
https://i.pinimg.com/originals/cb/32/5f/cb325f9c268bf2135125f512d95
Feature Engineering
• Worked examples starting from simple number manipulations to complex processes such as principal component analysis (PCA)
• Lots of Python code and decent explanations
• Primarily scikit-learn
• Decent bibliography per chapter
AI Platform
• Hosted Jupyter notebooks
• Distributable training with automatic resource provisioning
• Supports CPUs, GPUs and TPUs
• Run across many nodes and multiple experiments
• Automated hyperparameter tuning with HyperTune
• Exportable models
• Model hosting for online prediction
Exporting a scikit-learn Model
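from sklearn.ensemble import RandomForestClassifier
# sklearn.externals.joblib was deprecated in scikit-learn 0.21 and later removed;
# import the standalone joblib package instead
import joblib

model = RandomForestClassifier(n_estimators=n)
...
# Rebind predict so the hosted model returns class probabilities, not labels
model.predict = model.predict_proba
# Serialize the model, including the rebound predict, for upload
joblib.dump(model, 'model.joblib')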
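The rebinding on this slide only helps if it survives serialization. The sketch below illustrates that it does, using a stand-in class and the standard library's pickle rather than scikit-learn and joblib, so it runs without extra dependencies. TinyClassifier and its scoring rule are hypothetical and exist only for illustration; they are not part of the talk.

```python
import pickle


class TinyClassifier:
    """Hypothetical stand-in for a scikit-learn classifier."""

    def predict(self, rows):
        # Hard labels: 1 when the single feature exceeds 0.5
        return [1 if row[0] > 0.5 else 0 for row in rows]

    def predict_proba(self, rows):
        # Toy "probabilities": the feature itself is treated as P(class 1)
        return [[1 - row[0], row[0]] for row in rows]


model = TinyClassifier()

# Same trick as on the slide: shadow predict with predict_proba on the instance
model.predict = model.predict_proba

# Round-trip through serialization, as an export followed by a load would do
restored = pickle.loads(pickle.dumps(model))

# The restored object still answers predict() with probabilities
probabilities = restored.predict([[0.75]])
```

Because the override lives in the instance's attributes rather than on the class, it travels with the serialized object, so a caller that only knows about predict still receives probabilities.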
Cloud Pub/Sub
• Publish/Subscribe messaging service
• At-least-once delivery
• Seek & Replay
- A subscription only sees messages published after it was created
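At-least-once delivery means the same message can arrive more than once, so a subscriber should be safe to run twice on the same input. The sketch below shows one common way to get there, deduplicating on a message ID. It is an assumption about how a handler could be written, not the approach used in the talk; Message and IdempotentHandler are hypothetical names and do not come from the Pub/Sub client library.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class Message:
    """Hypothetical stand-in for a delivered message."""
    message_id: str  # assumed to be unique per published message
    data: bytes


@dataclass
class IdempotentHandler:
    """Runs `process` at most once per message ID."""
    process: Callable[[bytes], None]
    seen: Set[str] = field(default_factory=set)

    def handle(self, message: Message) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message.message_id in self.seen:
            return False
        self.process(message.data)
        # Record the ID only after processing succeeds, so a failure can be retried
        self.seen.add(message.message_id)
        return True


processed = []
handler = IdempotentHandler(process=processed.append)

alert = Message("m-1", b"disk full")
handler.handle(alert)
handler.handle(alert)  # simulated redelivery of the same message
handler.handle(Message("m-2", b"cpu high"))
```

The in-memory set keeps the sketch short. In a serverless function, which keeps no state between invocations, the seen IDs would need to live in an external store.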