Uncertainty in deep learning sandaysky.pptx

Inbar Naor / Data Scientist, Taboola
April 2018
Don't believe everything your
network tells you
Uncertainty in deep learning

Outline
• Why do we need uncertainty?
• Uncertainty Types
• Uncertainty in Recommendation Systems
• Bayesian Learning 101
• Data Uncertainty
• Model Uncertainty

Why Do We Need Uncertainty?
● High risk applications
● Out-of-distribution samples (inputs unlike the training data)
● Improving the model:
○ mistakes with high confidence
○ true labels with low confidence

Capturing Model Blind Spots Is Crucial

Model Uncertainty
More Data Please!

Data Uncertainty
More Data Won’t Help – I want to
know how good my predictions are

Uncertainty in Recommendation Systems

Content Discovery
A Deep Learning Approach to Discovery
• Context
• Metadata
• User Historical Data
• Region-based Location Information
• ~1M Possible Recommendations
• Rank N recommendations by CTR * CPC
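A minimal sketch of the ranking rule above, assuming CTR is a predicted click-through rate and CPC is the cost per click, so that CTR * CPC is the expected revenue per impression. The function name, tuple layout, and numbers are hypothetical, chosen only for illustration.

```python
def rank_by_expected_revenue(candidates, n):
    """Return the top-n (item_id, score) pairs, scored by CTR * CPC.

    candidates: list of (item_id, predicted_ctr, cpc) tuples (hypothetical layout).
    """
    scored = [(item_id, ctr * cpc) for item_id, ctr, cpc in candidates]
    # Highest expected revenue per impression first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Made-up candidates: a high CTR does not win if the CPC is low.
candidates = [
    ("a", 0.010, 0.50),
    ("b", 0.030, 0.10),
    ("c", 0.002, 4.00),
]
top2 = rank_by_expected_revenue(candidates, n=2)
print([item_id for item_id, _ in top2])
```

Because the score is a product of a prediction and a price, any error in the CTR estimate feeds directly into the ranking, which is one reason the uncertainty of that estimate matters.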

Machine Learning + Discovery = Hard
• Rank N recommendations by CTR * CPC
• Implicit Feedback
• Data is Very Sparse
• No Human Baseline
• World is Ever-changing
“Walmart cameras captured these awesome photos”
“Walmart cameras captured these hilarious photos”
“15 rarely seen WW2 Photos Discovered”

Why Go Deep for Discovery?
• Cold start is a huge issue
• Many hard sub-problems
– Language modeling
– Image classification
– User Profiling
• There are many complex relations

Exploration/Exploitation in Recommender Systems
• Exploitation: Best Performing Recommendations
• Exploration: Add new Information, Fight selection bias, Search for new stars
• Exploration at random doesn’t work

Bayesian Learning
● Probability is a measure of belief
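● Prior distribution represents our belief about the world
● Likelihood: how probable the observed data is under a given setting of the parameters
● Posterior: our updated belief after seeing the data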
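A small sketch of the prior, likelihood, posterior cycle, assuming a click/no-click (Bernoulli) likelihood with a Beta prior over the click-through rate. This conjugate pair is chosen only because the update has a closed form; the function names and counts are hypothetical.

```python
def beta_posterior(alpha, beta, clicks, impressions):
    """Update a Beta(alpha, beta) prior with observed clicks out of impressions."""
    return alpha + clicks, beta + (impressions - clicks)


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))


# Uniform prior Beta(1, 1), then the same observed rate with little and with lots of data.
few = beta_posterior(1.0, 1.0, clicks=3, impressions=10)
many = beta_posterior(1.0, 1.0, clicks=300, impressions=1000)
print(beta_mean(*few), beta_variance(*few))
print(beta_mean(*many), beta_variance(*many))
```

Both posteriors have a mean close to 0.3, but the one built from 1000 impressions is much narrower: the belief is the same, the confidence is not.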

Capturing Data Uncertainty
Know what you don’t know

Likelihood as loss
[Figure: words feed a chain of LSTM cells; the hidden states h1, h2, …, hn produce a single output y]

Likelihood as loss
(*) In classification we can assume a binomial distribution with Beta prior
[Figure: words feed a chain of LSTM cells; the hidden states h1, h2, …, hn produce two outputs, Mu and Std]
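One way to read "likelihood as loss" with Mu and Std outputs is to treat the target as Gaussian and minimize its negative log-likelihood. The sketch below assumes that reading; predicting the log of the standard deviation (so that Std stays positive) is an assumption of this example, not something stated on the slide.

```python
import math


def gaussian_nll(y, mu, log_std):
    """Negative log-likelihood of y under N(mu, exp(log_std)^2).

    log_std is used instead of std (an assumption) so any real value is valid.
    """
    std = math.exp(log_std)
    return 0.5 * math.log(2 * math.pi) + log_std + 0.5 * ((y - mu) / std) ** 2


# Same error (y - mu = 1), different claimed uncertainty.
overconfident = gaussian_nll(y=1.0, mu=0.0, log_std=math.log(0.1))
hedged = gaussian_nll(y=1.0, mu=0.0, log_std=math.log(1.0))
print(overconfident, hedged)
```

The loss punishes a wrong prediction far more when the network claimed a small Std, and the `log_std` term stops it from claiming a huge Std everywhere.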

Mixture Density Network
Christopher M. Bishop, Mixture density networks (1994)
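A mixture density network outputs the parameters of a mixture distribution instead of a single value. As a rough sketch, assuming Gaussian components and taking the mixture weights, means, and standard deviations as already produced by the network, the training loss is the negative log-likelihood of the target under the mixture. The function name and argument layout are hypothetical.

```python
import math


def mixture_nll(y, weights, mus, stds):
    """Negative log-likelihood of y under a 1-D Gaussian mixture.

    weights must be positive and sum to 1; stds must be positive.
    """
    log_terms = [
        math.log(w) - math.log(s) - 0.5 * math.log(2 * math.pi)
        - 0.5 * ((y - m) / s) ** 2
        for w, m, s in zip(weights, mus, stds)
    ]
    # Log-sum-exp over components, shifted by the largest term for stability.
    peak = max(log_terms)
    return -(peak + math.log(sum(math.exp(t - peak) for t in log_terms)))


# A target near one of two modes: the mixture fits it better than one wide Gaussian.
bimodal = mixture_nll(2.0, weights=[0.5, 0.5], mus=[-2.0, 2.0], stds=[0.5, 0.5])
single = mixture_nll(2.0, weights=[1.0], mus=[0.0], stds=[2.0])
print(bimodal, single)
```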

Capturing Data Uncertainty

Data Uncertainty and Training Error

Data Uncertainty and OOV
[Figure: the LSTM network with Mu and Std outputs, fed inputs that mix known words (Word1, Word2) with out-of-vocabulary (OOV) tokens; inputs containing OOV tokens probably get higher variance]

Neural Network - A Probabilistic Perspective
A neural network is a probabilistic model
We learn W using MLE:
Adding regularization using MAP:
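In standard notation (an assumption here: D = {(x_i, y_i)} is the training set and P(W) is a prior over the weights), the two estimates can be written as:

```latex
\begin{aligned}
W_{\mathrm{MLE}} &= \arg\max_{W} \log P(D \mid W)
                  = \arg\max_{W} \sum_{i} \log P(y_i \mid x_i, W) \\
W_{\mathrm{MAP}} &= \arg\max_{W} \log P(W \mid D)
                  = \arg\max_{W} \left[ \log P(D \mid W) + \log P(W) \right]
\end{aligned}
```

The extra term log P(W) is what acts as the regularizer; with a zero-mean Gaussian prior it reduces to an L2 penalty on the weights.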

Bayesian Neural Networks
Weights uncertainty
Prior:
Posterior:
Prediction:
The posterior is often intractable
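In standard notation (an assumption here: D is the training data, W the weights, and (x*, y*) a new input and its prediction), the three quantities are usually written as:

```latex
\begin{aligned}
\text{Prior:}      \quad & P(W) \\
\text{Posterior:}  \quad & P(W \mid D) = \frac{P(D \mid W)\, P(W)}{P(D)},
                     \qquad P(D) = \int P(D \mid W)\, P(W)\, dW \\
\text{Prediction:} \quad & P(y^{*} \mid x^{*}, D) = \int P(y^{*} \mid x^{*}, W)\, P(W \mid D)\, dW
\end{aligned}
```

The integral over all weight settings in P(D) is the part that typically has no closed form for a neural network, which is why approximate inference is needed.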

Bayesian Posterior Inference
• Variational Inference: high bias, low variance
• Sampling Methods: high variance, low bias

Dropout – a regularization technique

Dropout Variational Inference as Bayesian Approximation
• Monte Carlo dropout as a Bayesian approximation: keep dropout on at prediction time and aggregate several stochastic forward passes
• A little bit like bagging
Gal & Ghahramani, Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference (2016)
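A toy sketch of Monte Carlo dropout, assuming a tiny one-hidden-layer ReLU network with hand-picked weights (hypothetical, not a trained model). Dropout stays active at prediction time, the same input is passed through many times, and the spread of the outputs serves as a model-uncertainty estimate.

```python
import random
import statistics


def forward_with_dropout(x, w_hidden, w_out, drop_prob, rng):
    """One stochastic forward pass with dropout on the hidden units."""
    keep_prob = 1.0 - drop_prob
    hidden = []
    for weights in w_hidden:
        activation = max(0.0, sum(w * xi for w, xi in zip(weights, x)))  # ReLU
        # Inverted dropout: drop the unit, or rescale it so the expectation is unchanged.
        keep = rng.random() < keep_prob
        hidden.append(activation / keep_prob if keep else 0.0)
    return sum(w * h for w, h in zip(w_out, hidden))


def mc_dropout_predict(x, w_hidden, w_out, drop_prob=0.5, n_samples=200, seed=0):
    """Return (mean, standard deviation) over n_samples stochastic passes."""
    rng = random.Random(seed)
    samples = [
        forward_with_dropout(x, w_hidden, w_out, drop_prob, rng)
        for _ in range(n_samples)
    ]
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical weights: two hidden units, each reading one input feature.
w_hidden = [[1.0, 0.0], [0.0, 1.0]]
w_out = [1.0, 1.0]
mean, std = mc_dropout_predict([1.0, 2.0], w_hidden, w_out)
print(mean, std)
```

Each dropout mask selects a different sub-network, so averaging over masks resembles averaging an ensemble, which is where the bagging comparison comes from.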

Uncertainty as a function of amount of data
Image: Yarin Gal, “What My Deep Model Doesn’t Know”

Uncertainty in exploration
● Upper Confidence Bound - adding sigma to the score
● Thompson Sampling - sample from the distribution
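A rough sketch of both strategies, assuming each candidate comes with a predicted score mu and an uncertainty sigma, and (for Thompson Sampling) that the prediction is treated as Gaussian. The item names, numbers, and the multiplier `k` are hypothetical.

```python
import random


def choose_ucb(items, k=1.0):
    """Upper Confidence Bound: pick the item with the highest mu + k * sigma."""
    return max(items, key=lambda name: items[name][0] + k * items[name][1])


def choose_thompson(items, rng):
    """Thompson Sampling: draw one score per item from N(mu, sigma), pick the best draw."""
    draws = {name: rng.gauss(mu, sigma) for name, (mu, sigma) in items.items()}
    return max(draws, key=draws.get)


# name -> (mu, sigma): a well-known item and a newer, less certain one.
items = {
    "established": (0.030, 0.001),
    "new": (0.025, 0.010),
}

print(choose_ucb(items, k=0.0))  # pure exploitation
print(choose_ucb(items, k=1.0))  # the uncertainty bonus changes the choice

rng = random.Random(0)
picks = [choose_thompson(items, rng) for _ in range(1000)]
print(picks.count("new") / len(picks))
```

Unlike exploration at random, both rules spend exploration only on items whose uncertainty leaves room for them to be better than the current best.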

Summary
• Two types of uncertainty: model and data
• Mixture Density Networks
– Captures the true variance of your prediction
• Monte-Carlo Dropouts Variational Inference
– Sheds light on where the model lacks data
• Why not both?
• Also interesting: uncertainty due to measurement noise

Thank You
Feel free to contact me with questions!
Inbar Naor - inbar.naor1@gmail.com
Taboola Engineering Blog
Unsupervised - A Podcast about Data Science in Israel

Bibliography
● What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? Kendall & Gal, 2017
● Dropout Inference in Bayesian Neural Networks with Alpha-Divergences. Li & Gal, ICML 2017
● On Modern Deep Learning and Variational Inference. Gal & Ghahramani, Advances in Approximate Bayesian Inference workshop, NIPS 2015
● Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference. Gal & Ghahramani, ICLR 2016
● Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. Gal & Ghahramani, ICML 2016
● Dropout as a Bayesian Approximation: Insights and Applications. Gal & Ghahramani, Deep Learning Workshop, ICML 2015
● Weight Uncertainty in Neural Networks. Blundell, Cornebise, Kavukcuoglu & Wierstra, 2015
● Training Deep Learning Neural Networks Based on Unreliable Labels. Bekker & Goldberger, ICASSP 2016
● Keeping Neural Networks Simple by Minimizing the Description Length of the Weights. Hinton & van Camp, COLT 1993
● Practical Variational Inference for Neural Networks. Graves, NIPS 2011
● Learning Weight Uncertainty with Stochastic Gradient MCMC for Shape Classification. Li et al., CVPR 2016
