2. Demo: Predictive Modeling
• Train a predictive model using 699 biopsies
• The “label” of benign or malignant is known for each one
• Since we have labels, this is supervised learning
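A minimal sketch of supervised learning, assuming two numeric features per row and a nearest-centroid classifier. The data is synthetic and the function names are hypothetical. This is not the 699-biopsy dataset, and the demo's actual model type is not stated on the slide.

```python
import random

def train_centroids(rows, labels):
    """Learn one mean feature vector (centroid) per label."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, row):
    """Predict the label whose centroid is closest to `row`."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Synthetic labeled data (an assumption for illustration only).
rng = random.Random(0)
benign = [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(100)]
malignant = [[rng.gauss(7, 1), rng.gauss(7, 1)] for _ in range(100)]
rows = benign + malignant
labels = ["benign"] * 100 + ["malignant"] * 100

# Because every row has a known label, the model can be fit directly.
model = train_centroids(rows, labels)
print(predict(model, [1.5, 2.5]))
print(predict(model, [8.0, 6.5]))
```

The key point is that `labels` is available at training time; the sections that follow drop that assumption.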
3. What if we don’t have labels?
• Can we get insight into our data if we don’t know the labels?
• Enter anomaly detection
• Since we don’t have labels, this is unsupervised learning
4. 10 lines are needed to isolate this data point (not anomalous)
5. Only 4 lines are needed to isolate this data point (highly anomalous)
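The idea behind these two slides can be sketched as follows: draw random axis-aligned lines, keep only the side containing the point of interest, and count how many lines it takes until the point is alone. This is a simplified illustration on synthetic 2-D data, not the detector used in the demo. The names `isolation_depth` and `average_depth` are hypothetical, and the counts it prints will not match the 10 and 4 shown on the slides.

```python
import random

def isolation_depth(point, data, rng, max_depth=50):
    """Count random cuts until `point` is alone (`point` must be in `data`)."""
    remaining, depth = data, 0
    while len(remaining) > 1 and depth < max_depth:
        dim = rng.randrange(len(point))          # pick a random feature
        values = [p[dim] for p in remaining]
        cut = rng.uniform(min(values), max(values))  # pick a random cut position
        # Keep only the points on the same side of the cut as `point`.
        side = point[dim] < cut
        remaining = [p for p in remaining if (p[dim] < cut) == side]
        depth += 1
    return depth

def average_depth(point, data, trials=200, seed=0):
    """Average the cut count over many random trials to smooth out luck."""
    rng = random.Random(seed)
    return sum(isolation_depth(point, data, rng) for _ in range(trials)) / trials

# Synthetic data (an assumption): one dense cluster plus a single far-away point.
rng = random.Random(42)
cluster = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
data = cluster + [outlier]
inlier = min(cluster, key=lambda p: p[0] ** 2 + p[1] ** 2)  # point nearest the center

outlier_depth = average_depth(outlier, data)
inlier_depth = average_depth(inlier, data)
print(f"outlier: {outlier_depth:.1f} cuts, inlier: {inlier_depth:.1f} cuts")
```

Points that need few cuts sit in sparse regions, so a low average count can serve as an anomaly score.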
6. Demo: Anomaly Detection
• Remove the labels of benign or malignant
• Train an anomaly detector on this unlabeled data
• Create a new dataset with the anomaly scores as “labels”
• Use these “labels” to train a predictive model!
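The four steps above can be sketched as follows. Everything here is an assumption made for illustration: the data is synthetic, the anomaly score is a simplified count of random cuts needed to isolate a point, the 3% flagging threshold is arbitrary, and a 1-nearest-neighbor lookup stands in for whatever predictive model the demo actually trains.

```python
import random

def avg_depth(point, data, rng, trials=40, max_depth=50):
    """Average number of random cuts needed to isolate `point`; lower = more anomalous."""
    total = 0
    for _ in range(trials):
        remaining, depth = data, 0
        while len(remaining) > 1 and depth < max_depth:
            dim = rng.randrange(len(point))
            values = [p[dim] for p in remaining]
            cut = rng.uniform(min(values), max(values))
            side = point[dim] < cut
            remaining = [p for p in remaining if (p[dim] < cut) == side]
            depth += 1
        total += depth
    return total / trials

def predict_1nn(train_rows, train_labels, row):
    """Return the label of the closest training row (1-nearest-neighbor)."""
    nearest = min(
        range(len(train_rows)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_rows[i], row)),
    )
    return train_labels[nearest]

rng = random.Random(7)

# Step 1: unlabeled data (synthetic): a dense cluster plus four far-away points.
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(150)]
data += [(20.0, 20.0), (-20.0, 20.0), (20.0, -20.0), (-20.0, -20.0)]

# Step 2: score every row with the anomaly detector.
depths = [avg_depth(p, data, rng) for p in data]

# Step 3: turn scores into pseudo-labels by flagging the shallowest ~3% of rows.
k = int(0.03 * len(data))
flagged = set(sorted(range(len(data)), key=lambda i: depths[i])[:k])
pseudo_labels = [i in flagged for i in range(len(data))]

# Step 4: use the pseudo-labels as training targets for a predictive model.
print(predict_1nn(data, pseudo_labels, (19.0, 18.0)))
print(predict_1nn(data, pseudo_labels, (0.2, -0.1)))
```

No true labels are used anywhere: the second model only learns to reproduce the detector's judgment, which makes that judgment easier to inspect and apply to new rows.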
9. Minority Report
• Anomaly detection works well on large unlabeled datasets,
especially if you expect to find an (adversarial) minority class
• Examples: millions of credit card transactions, billions of network events …
• Doesn’t require you to know what you’re looking for!