Review of Filos et al. (2019):
“A Systematic Comparison of Bayesian Deep
Learning Robustness in Diabetic Retinopathy”
NeurIPS Reading Group @ Ridge-i
January 31st 2020
Aaron C. Bell, Engineer
• Paper PDF:
• http://bayesiandeeplearning.org/2019/papers/12.pdf
The context: A major problem with DL
...and a major bottleneck in BDL
● UCI “toy” datasets limit research in BDL:
○ Yachts, Wine, Concrete, Energy…
The context: A major problem with BDL?
● UCI “toy” datasets… are they too easy?
What is Bayesian deep learning?
● Extension of Bayesian methods to deep learning (neural networks)
○ Taking account of prior information
○ Getting robust uncertainties on predictions
○ Allows DL to be applied in real-world applications where uncertainties are critical
● Allows us to ask:
○ How powerful are your results… really?
○ Is higher accuracy really a significant result?
■ Opens door to DL use for scientific hypothesis testing
The paper’s objectives:
1) Widen the bottleneck in BDL: provide a better benchmark dataset (than UCI)
2) Show off the strong points of BDL: argue a specific, challenging, real-world example where BDL is needed (medical diagnosis).
A better benchmark dataset for BDL
● Step 1: Choose an existing dataset that’s suited for BDL’s strengths:
○ 1) High-dimensional
○ 2) Large number of examples
○ 3) Requiring more complex models
● Step 2: Enhance suitability for BDL benchmarking
○ 1) Pre-process the dataset.
○ 2) Develop API for benchmarking.
A better benchmark dataset for BDL
● Step 1: Choose an existing high-dimensional, large dataset:
○ Diabetic retinopathy (DR) “fundus” images (Kaggle dataset)
A better benchmark dataset for BDL
● Step 2: Pre-process the dataset:
○ Redefine the problem: 5 classes of diabetic retinopathy (DR) become binary
■ Original 5 classes: 0: No DR; 1: Mild DR; 2: Moderate DR; 3: Severe DR; 4: Critical DR
■ New binary classes: 0: Sight not in danger; 1: Sight in danger
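The relabeling on this slide can be sketched in a few lines. The slide text does not say where the cut between the two binary classes falls, so the threshold below (moderate DR and above counts as "sight in danger") is an assumption, and `binarize_dr_label` is a hypothetical helper name rather than anything from the paper's benchmarking API.

```python
# Hypothetical helper: collapse the 5 DR grades into the binary task.
# ASSUMPTION: grades >= 2 (moderate and above) map to "sight in danger";
# the slide itself does not state the cut-off.
DR_GRADES = {
    0: "No DR",
    1: "Mild DR",
    2: "Moderate DR",
    3: "Severe DR",
    4: "Critical DR",
}
SIGHT_THREATENING_FROM = 2  # assumed cut-off


def binarize_dr_label(grade: int, threshold: int = SIGHT_THREATENING_FROM) -> int:
    """Return 0 (sight not in danger) or 1 (sight in danger)."""
    if grade not in DR_GRADES:
        raise ValueError(f"unknown DR grade: {grade}")
    return int(grade >= threshold)


binary_labels = [binarize_dr_label(g) for g in range(5)]
```

Keeping the threshold as a parameter makes it easy to check how sensitive a benchmark result is to where the binary cut is placed.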
A better benchmark dataset for BDL
● Step 2: Pre-process the dataset:
○ Augment data: Make it challenging enough for BDL.
Objective 2) Show an example where BDL is needed
● Giving predictions with uncertainties
● Informing medical diagnosis
● Streamlining patient referrals
○ Confident prediction: automatic final diagnosis
○ Uncertain prediction: referral to “real” doctor
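The referral idea on this slide can be sketched as a simple rule: if the predictive uncertainty is low, accept the automatic diagnosis; otherwise send the case to a doctor. The sketch below uses the entropy of the predicted probability as the uncertainty measure. The function names and the `max_entropy` tolerance are illustrative assumptions, not values or APIs from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli prediction with P(sight in danger) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def triage(p_mean: float, max_entropy: float = 0.5) -> str:
    """Route one patient: diagnose automatically or refer to a doctor.

    max_entropy is a hypothetical tolerance chosen for illustration.
    """
    if binary_entropy(p_mean) > max_entropy:
        return "refer to doctor"
    return "sight in danger" if p_mean >= 0.5 else "sight not in danger"


# Three made-up predicted probabilities: confident negative, unsure, confident positive.
decisions = [triage(p) for p in (0.02, 0.55, 0.97)]
```

In practice `p_mean` would be the averaged output of one of the BDL methods compared later in the deck, and the tolerance would be tuned to the clinic's referral capacity.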
Method: Compare four BDL techniques
● Bayesian Neural Networks:
○ 1) Mean-field variational inference (MFVI)
○ 2) Monte Carlo Dropout (MC Dropout)
● 3) Model Ensembling: “Deep Ensemble”
● 4) Combine (2) and (3): “Ensemble MC Dropout”
● 5*) Deterministic baseline
Bayesian Neural Networks
● 1) Mean-field Variational Inference
● 2) Monte Carlo Dropout
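As a rough illustration of MC Dropout, the sketch below keeps dropout switched on at prediction time, runs several stochastic forward passes, and reports the mean and spread of the outputs. It is a toy, standard-library-only network with made-up weights, not the architecture or code used in the paper; all function names are hypothetical.

```python
import math
import random
import statistics


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_forward(x, w_hidden, w_out, drop_rate, rng):
    """One forward pass with dropout left ON (the MC Dropout trick)."""
    keep = 1.0 - drop_rate
    hidden = []
    for weights in w_hidden:
        act = max(0.0, sum(w * xi for w, xi in zip(weights, x)))  # ReLU unit
        mask = 1.0 if rng.random() < keep else 0.0  # sample a dropout mask
        hidden.append(act * mask / keep)  # inverted-dropout scaling
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))


def mc_dropout_predict(x, w_hidden, w_out, drop_rate=0.2, n_samples=100, seed=0):
    """Mean and standard deviation of the prediction over n_samples passes."""
    rng = random.Random(seed)
    samples = [
        stochastic_forward(x, w_hidden, w_out, drop_rate, rng)
        for _ in range(n_samples)
    ]
    return statistics.fmean(samples), statistics.pstdev(samples)


# Toy weights, made up for illustration only.
w_hidden = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]
w_out = [1.0, -1.5, 0.7]
mean_p, std_p = mc_dropout_predict([1.0, 2.0], w_hidden, w_out)
```

With `drop_rate=0.0` every pass is identical and the spread collapses to zero, which is the deterministic baseline case.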
3) Model Ensembling
● No special training or inference techniques.
● Just train a bunch of models in parallel, with different initial conditions (ICs)
● Can be combined with MC Dropout
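A minimal sketch of the ensembling recipe, assuming a toy 1-D logistic-regression model in place of a deep network: each member starts from a different random initial condition, is trained independently, and the member predictions are averaged. The data, hyperparameters, and function names are invented for illustration.

```python
import math
import random
import statistics


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(data, seed, epochs=200, lr=0.5):
    """Plain gradient descent on log-loss from a seed-dependent starting point."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)  # different ICs per member
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return w, b


def ensemble_predict(members, x):
    """Average member probabilities; their spread is a rough uncertainty."""
    probs = [sigmoid(w * x + b) for w, b in members]
    return statistics.fmean(probs), statistics.pstdev(probs)


# Toy 1-D data, invented for illustration.
data = [(-2.0, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (2.0, 1)]
members = [train_logistic(data, seed=s) for s in range(5)]
mean_p, spread = ensemble_predict(members, 1.5)
```

The members are independent of one another, so the list comprehension could be replaced by parallel jobs without changing the result.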
4) Ensemble MC Dropout
● An ensemble of MC dropout networks
○ MC simulation: dropout applied during test time
Naive Baselines
● Deterministic
● Random
The state of the art… SPOILER WARNING
Is MFVI really the best BDL technique?
● UCI (easy) benchmarks: “Yes”
● This paper (hard) benchmark: “No”
Comparison of Various Approaches: Data retention
● Figure panels: In-domain (Kaggle DR); Out-of-domain (India blindness detection dataset)
● All models converge on the full dataset (within std error bar), so the uncertainty comparison is fair.
● Ensemble MC Dropout always performs best at 50% data retention
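A data-retention comparison of this kind can be sketched as follows: rank the test cases by the model's uncertainty, keep only the most certain fraction, and score the model on what is kept. The sketch uses accuracy as the metric and toy values invented for illustration; `retention_accuracy` is a hypothetical name, not the benchmark's API.

```python
def retention_accuracy(labels, predictions, uncertainties, retain_fraction):
    """Accuracy on the retain_fraction of cases the model is most certain about."""
    if not 0.0 < retain_fraction <= 1.0:
        raise ValueError("retain_fraction must be in (0, 1]")
    # Indices sorted from most certain to least certain.
    order = sorted(range(len(labels)), key=lambda i: uncertainties[i])
    n_keep = max(1, round(retain_fraction * len(labels)))
    kept = order[:n_keep]
    correct = sum(labels[i] == predictions[i] for i in kept)
    return correct / n_keep


# Toy values, invented for illustration; the two mistakes carry high uncertainty.
labels = [0, 1, 1, 0, 1, 0, 1, 0]
predictions = [0, 1, 0, 0, 1, 1, 1, 0]
uncertainties = [0.05, 0.10, 0.65, 0.08, 0.20, 0.60, 0.15, 0.12]

curve = {
    f: retention_accuracy(labels, predictions, uncertainties, f)
    for f in (0.5, 0.75, 1.0)
}
```

If a method's uncertainties are informative, its score should rise as less data is retained, because the discarded cases are the ones it tends to get wrong; at 100% retention the uncertainties play no role.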
Major conclusions...
● Overuse of UCI may have misled the BDL community.
● Harder benchmarks give a better picture of BDL method performance
● BDL methods are suited for cases where uncertainty is critical for the downstream decision task (medical diagnosis, re-evaluation).

More Related Content

Similar to Neur ips yomikai_at_ridgei_aaron_jan312020

Deep Conditional Adversarial learning for polyp Segmentation
Deep Conditional Adversarial learning for polyp SegmentationDeep Conditional Adversarial learning for polyp Segmentation
Deep Conditional Adversarial learning for polyp Segmentation
multimediaeval
 
KDD22_tutorial_slides_final_sharing.pptx
KDD22_tutorial_slides_final_sharing.pptxKDD22_tutorial_slides_final_sharing.pptx
KDD22_tutorial_slides_final_sharing.pptx
mattmcknight4
 
2015 aem-grs-keynote
2015 aem-grs-keynote2015 aem-grs-keynote
2015 aem-grs-keynote
c.titus.brown
 
Automatically Detecting Scientific Misinformation
Automatically Detecting Scientific MisinformationAutomatically Detecting Scientific Misinformation
Automatically Detecting Scientific Misinformation
Isabelle Augenstein
 
Oxford Lectures Part 1
Oxford Lectures Part 1Oxford Lectures Part 1
Oxford Lectures Part 1Andrea Pasqua
 
BD2K Update
BD2K Update BD2K Update
BD2K Update
Philip Bourne
 
Paris Data Ladies #14
Paris Data Ladies #14Paris Data Ladies #14
Paris Data Ladies #14
Nina Bertrand
 
Jillian ms defense-4-14-14-ja
Jillian ms defense-4-14-14-jaJillian ms defense-4-14-14-ja
Jillian ms defense-4-14-14-jaJillian Aurisano
 
Solving the Database Problem
Solving the Database ProblemSolving the Database Problem
Solving the Database Problem
Jay Gordon
 
Comparing Incremental Learning Strategies for Convolutional Neural Networks
Comparing Incremental Learning Strategies for Convolutional Neural NetworksComparing Incremental Learning Strategies for Convolutional Neural Networks
Comparing Incremental Learning Strategies for Convolutional Neural Networks
Vincenzo Lomonaco
 
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
Vincenzo Lomonaco
 
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Databricks
 
Generalization Ability of MOS Prediction Networks
Generalization Ability of MOS Prediction NetworksGeneralization Ability of MOS Prediction Networks
Generalization Ability of MOS Prediction Networks
Yamagishi Laboratory, National Institute of Informatics, Japan
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptx
MelkamuGebeyehu1
 
Layer-wise CNN Surgery for Visual Sentiment Prediction
Layer-wise CNN Surgery for Visual Sentiment PredictionLayer-wise CNN Surgery for Visual Sentiment Prediction
Layer-wise CNN Surgery for Visual Sentiment Prediction
Universitat Politècnica de Catalunya
 
MONAI and Open Science for Medical Imaging Deep Learning: SIPAIM 2020
MONAI and Open Science for Medical Imaging Deep Learning: SIPAIM 2020MONAI and Open Science for Medical Imaging Deep Learning: SIPAIM 2020
MONAI and Open Science for Medical Imaging Deep Learning: SIPAIM 2020
Stephen Aylward
 
Usage of Generative Adversarial Networks (GANs) in Healthcare
Usage of Generative Adversarial Networks (GANs) in HealthcareUsage of Generative Adversarial Networks (GANs) in Healthcare
Usage of Generative Adversarial Networks (GANs) in Healthcare
GlobalLogic Ukraine
 
351315535-Module-1-Intro-to-Data-Science-pptx.pptx
351315535-Module-1-Intro-to-Data-Science-pptx.pptx351315535-Module-1-Intro-to-Data-Science-pptx.pptx
351315535-Module-1-Intro-to-Data-Science-pptx.pptx
XanGwaps
 
Week_2_Lecture.pdf
Week_2_Lecture.pdfWeek_2_Lecture.pdf
Week_2_Lecture.pdf
AlbertoLugoGonzalez
 
Interpreting Complex Real World Data for Pharmaceutical Research
Interpreting Complex Real World Data for Pharmaceutical ResearchInterpreting Complex Real World Data for Pharmaceutical Research
Interpreting Complex Real World Data for Pharmaceutical Research
Paul Agapow
 

Similar to Neur ips yomikai_at_ridgei_aaron_jan312020 (20)

Deep Conditional Adversarial learning for polyp Segmentation
Deep Conditional Adversarial learning for polyp SegmentationDeep Conditional Adversarial learning for polyp Segmentation
Deep Conditional Adversarial learning for polyp Segmentation
 
KDD22_tutorial_slides_final_sharing.pptx
KDD22_tutorial_slides_final_sharing.pptxKDD22_tutorial_slides_final_sharing.pptx
KDD22_tutorial_slides_final_sharing.pptx
 
2015 aem-grs-keynote
2015 aem-grs-keynote2015 aem-grs-keynote
2015 aem-grs-keynote
 
Automatically Detecting Scientific Misinformation
Automatically Detecting Scientific MisinformationAutomatically Detecting Scientific Misinformation
Automatically Detecting Scientific Misinformation
 
Oxford Lectures Part 1
Oxford Lectures Part 1Oxford Lectures Part 1
Oxford Lectures Part 1
 
BD2K Update
BD2K Update BD2K Update
BD2K Update
 
Paris Data Ladies #14
Paris Data Ladies #14Paris Data Ladies #14
Paris Data Ladies #14
 
Jillian ms defense-4-14-14-ja
Jillian ms defense-4-14-14-jaJillian ms defense-4-14-14-ja
Jillian ms defense-4-14-14-ja
 
Solving the Database Problem
Solving the Database ProblemSolving the Database Problem
Solving the Database Problem
 
Comparing Incremental Learning Strategies for Convolutional Neural Networks
Comparing Incremental Learning Strategies for Convolutional Neural NetworksComparing Incremental Learning Strategies for Convolutional Neural Networks
Comparing Incremental Learning Strategies for Convolutional Neural Networks
 
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
 
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
 
Generalization Ability of MOS Prediction Networks
Generalization Ability of MOS Prediction NetworksGeneralization Ability of MOS Prediction Networks
Generalization Ability of MOS Prediction Networks
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptx
 
Layer-wise CNN Surgery for Visual Sentiment Prediction
Layer-wise CNN Surgery for Visual Sentiment PredictionLayer-wise CNN Surgery for Visual Sentiment Prediction
Layer-wise CNN Surgery for Visual Sentiment Prediction
 
MONAI and Open Science for Medical Imaging Deep Learning: SIPAIM 2020
MONAI and Open Science for Medical Imaging Deep Learning: SIPAIM 2020MONAI and Open Science for Medical Imaging Deep Learning: SIPAIM 2020
MONAI and Open Science for Medical Imaging Deep Learning: SIPAIM 2020
 
Usage of Generative Adversarial Networks (GANs) in Healthcare
Usage of Generative Adversarial Networks (GANs) in HealthcareUsage of Generative Adversarial Networks (GANs) in Healthcare
Usage of Generative Adversarial Networks (GANs) in Healthcare
 
351315535-Module-1-Intro-to-Data-Science-pptx.pptx
351315535-Module-1-Intro-to-Data-Science-pptx.pptx351315535-Module-1-Intro-to-Data-Science-pptx.pptx
351315535-Module-1-Intro-to-Data-Science-pptx.pptx
 
Week_2_Lecture.pdf
Week_2_Lecture.pdfWeek_2_Lecture.pdf
Week_2_Lecture.pdf
 
Interpreting Complex Real World Data for Pharmaceutical Research
Interpreting Complex Real World Data for Pharmaceutical ResearchInterpreting Complex Real World Data for Pharmaceutical Research
Interpreting Complex Real World Data for Pharmaceutical Research
 

Recently uploaded

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 

Neur ips yomikai_at_ridgei_aaron_jan312020

  • 1. Review of Filos et al. (2019): “A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy” NeurIPSよみかい@Ridge-i January 31st 2020 Aaron C. Bell, Engineer
  • 2.
  • 3. • Paper PDF: • http://bayesiandeeplearning.org/2019/paper s/12.pdf
  • 4. The context:A major problem with DL
  • 5. ...and a major bottleneck in BDL ● UCI “toy” datasets limit research in BDL:
  • 6. ...and a major bottleneck in BDL ● UCI “toy” datasets limit research in BDL: ● Yachts, Wine, Concrete, Energy….
  • 7. ...and a major bottleneck in BDL ● UCI “toy” datasets... ● Yachts, Wine, Concrete, Energy….
  • 8. ● UCI “toy” datasets… are they too easy? ● Yachts, Wine, Concrete, Energy…. The context:A major problem with DL BDL?
  • 9. What is Bayesian deep learning? ● Extension of Bayesian methods to deep learning ○ Taking account of prior information to ○ Getting robust uncertainties on predictions ● Allows us to ask: ○ How powerful are your results… really? ○ Is higher accuracy really a significant result?
  • 10. What is Bayesian deep learning? ● Extension of Bayesian methods to neural networks
  • 11. What is Bayesian deep learning? ● Extension of Bayesian methods to deep learning ○ Taking account of prior information ○ Getting robust uncertainties on predictions ● Allows us to ask: ○ How powerful are your results… really? ○ Is higher accuracy really a significant result?
  • 12. What is Bayesian deep learning? ● Extension of Bayesian methods to deep learning ○ Taking account of prior information to ○ Getting robust uncertainties on predictions ● Allows us to ask: ○ How powerful are your results… really? ○ Is higher accuracy really a significant result?
  • 13. What is Bayesian deep learning? ● Extension of Bayesian methods to neural networks ○ Allows DL to be applied in real-world applications where uncertainties are critical
  • 14. What is Bayesian deep learning? ● Extension of Bayesian methods to deep learning ○ Taking account of prior information to ○ Getting robust uncertainties on predictions ● Allows us to ask: ○ How powerful are your results… really? ○ Is higher accuracy really a significant result?
  • 15. What is Bayesian deep learning? ● Extension of Bayesian methods to deep learning ○ Taking account of prior information to ○ Getting robust uncertainties on predictions ● Allows us to ask: ○ How powerful are your results… really? ○ Is higher accuracy really a significant result?
  • 16. What is Bayesian deep learning? ● Extension of Bayesian methods to deep learning ○ Taking account of prior information to ○ Getting robust uncertainties on predictions ● Allows us to ask: ○ How powerful are your results… really? ○ Is higher accuracy really a significant result?
  • 17. What is Bayesian deep learning? ● Extension of Bayesian methods to deep learning ○ Taking account of prior information to ○ Getting robust uncertainties on predictions ● Allows us to ask: ○ How powerful are your results… really? ○ Is higher accuracy really a significant result? ■ Opens door to DL use for scientific hypothesis testing
  • 18. The paper’s objectives: 1) Widen the bottleneck in BDL --- provide a better benchmark dataset (than UCI) 1) Show off the strong points of BDL --- argue a specific, challenging, real-world example where BDL is needed, medical diagnosis.
  • 19. The paper’s objectives: 1) Widen the bottleneck in BDL --- provide a better benchmark dataset (than UCI) 1) Show off the strong points of BDL --- argue a specific, challenging, real-world example where BDL is needed, medical diagnosis.
  • 20. The paper’s objectives: 1) Widen the bottleneck in BDL --- provide a better benchmark dataset (than UCI)
  • 21. A better benchmark dataset for BDL ● Step 1: Choose an existing dataset that’s suited for BDL’s strengths: ○ 1) Highly dimensional ○ 2) Large dataset ○ 3) Requiring more complex models ● Step 2: Enhance suitability for BDL benchmarking ○ 1) Pre-process the dataset. ○ 2) Develop API for benchmarking.
  • 24. A better benchmark dataset for BDL ● Step 1: Choose an existing dataset that’s suited for BDL’s strengths: ○ 1) Highly dimensional ○ 2) Large number of examples ○ 3) Requiring more complex models ● Step 2: Enhance suitability for BDL benchmarking ○ 1) Pre-process the dataset. ○ 2) Develop API for benchmarking.
  • 26. A better benchmark dataset for BDL ● Step 1: Choose an existing highly dimensional, large dataset…. Diabetic retinopathy “fundus” images (Kaggle dataset)
  • 27. A better benchmark dataset for BDL ● Step 1: Choose an existing highly dimensional, large dataset…. Diabetic retinopathy (DR) “fundus” images (Kaggle dataset)
  • 28. A better benchmark dataset for BDL ● Step 2: Pre-process the dataset: ○ Redefine the problem: from 5 classes of diabetic retinopathy (DR) (0: No DR, 1: Mild DR, 2: Moderate DR, 3: Severe DR, 4: Critical DR) to binary (0: Sight not in danger, 1: Sight in danger)
  • 29. A better benchmark dataset for BDL ● Step 2: Pre-process the dataset: ○ Augment data: Make it challenging enough for BDL.
  • 30. Objective 2) Show an example where BDL is needed ● Giving predictions with uncertainties ● Informing medical diagnosis ● Streamlining patient referrals
  • 31. Objective 2) Show an example where BDL is needed ● Giving predictions with uncertainties ● Informing medical diagnosis ● Streamlining patient referrals Automatic Final Diagnosis
  • 32. Objective 2) Show an example where BDL is needed ● Giving predictions with uncertainties ● Informing medical diagnosis ● Streamlining patient referrals Automatic Final Diagnosis Referral to “real” doctor
  • 33. A better benchmark dataset for BDL
  • 34. Method: Compare four BDL techniques ● Bayesian Neural Networks: ○ 1) Mean-field variational inference (MFVI) ○ 2) Monte Carlo Dropout (MC Dropout)
  • 35. Four methods to compare.. ● Bayesian Neural Networks: ○ 1) Mean-field variational inference (MFVI) ○ 2) Monte Carlo Dropout (MC Dropout)
  • 36. Four methods to compare.. ● Bayesian Neural Networks: ○ 1) Mean-field variational inference (MFVI) ○ 2) Monte Carlo Dropout (MC Dropout) ● 3) Model Ensembling --- “Deep Ensemble”
  • 37. Four methods to compare.. ● Bayesian Neural Networks: ○ 1) Mean-field variational inference (MFVI) ○ 2) Monte Carlo Dropout (MC Dropout) ● 3) Model Ensembling -- “Deep Ensemble” ● 4) Combine (2) and (3) -- “Ensemble MC Dropout”
  • 38. Four methods to compare.. ● Bayesian Neural Networks: ○ 1) Mean-field variational inference (MFVI) ○ 2) Monte Carlo Dropout (MC Dropout) ● 3) Model Ensembling ● 4) Combine (2) and (3) ● 5*) Deterministic baseline
  • 39. Bayesian Neural Networks ● 1) Mean-field Variational Inference ● 2) Monte Carlo Dropout
  • 40. Bayesian Neural Networks ● 1) Mean-field Variational Inference
  • 41. Bayesian Neural Networks ● 1) Mean-field Variational Inference ● 2) Monte Carlo Dropout
  • 42. 3) Model Ensembling ● No special training or inference techniques.
  • 44. 3) Model Ensembling ● Just train a bunch of models in parallel, with different initial conditions (ICs)
  • 45. 3) Model Ensembling ● Can be combined with MC Dropout
  • 46. 4) Ensemble MC Dropout ● An ensemble of MC dropout networks ● MC simulation: dropout applied during test time
  • 48. The state of the art... Is MFVI really the best BDL technique?
  • 50. The state of the art… SPOILER WARNING Is MFVI really the best BDL technique?
  • 51. The state of the art... Is MFVI really the best BDL technique? ● UCI (easy) benchmarks: “Yes”
  • 52. The state of the art... Is MFVI really the best BDL technique? ● UCI (easy) benchmarks: “Yes” ● This paper (hard) benchmark: “No”
  • 53. Comparison of Various Approaches: Data retention
  • 54. Comparison of Various Approaches: Data retention In-domain (Kaggle DR)
  • 55. Comparison of Various Approaches: Data retention In-domain (Kaggle DR) Out-of-domain (India blindness detection dataset)
  • 56. Comparison of Various Approaches: Data retention In-domain (Kaggle DR) Out-of-domain (India blindness detection dataset) ● All models converge on the full dataset (within standard error bars) ● Uncertainty comparison is fair.
  • 57. Comparison of Various Approaches: Data retention ● Ensemble MC Dropout always performs best at 50% data retention
  • 58. Comparison of Various Approaches: Data retention
  • 59. Major conclusions... ● Overuse of UCI may have misled the BDL community.
  • 60. Major conclusions... ● Overuse of UCI may have misled the BDL community. ● Harder benchmarks give a better picture of BDL method performance
  • 61. Major conclusions... ● Overuse of UCI may have misled the BDL community. ● Harder benchmarks give a better picture of BDL method performance ● BDL methods are suited for cases where uncertainty is critical for the downstream decision task… (medical diagnosis, re-evaluation).
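
The "data retention" comparison and the referral workflow in the slides can be illustrated with a small sketch: sort cases by predictive uncertainty, refer the most uncertain ones to a doctor, and measure accuracy on the cases that are retained. This is only an illustration under stated assumptions, not the benchmark's actual code. The function names, the choice of predictive entropy as the uncertainty measure, and the toy numbers are all hypothetical.

```python
import numpy as np

def predictive_entropy(p):
    """Binary predictive entropy (in nats) of probabilities p = P(y=1 | x)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def accuracy_vs_retention(y_true, p_mean, uncertainty, fractions):
    """Accuracy on the most certain `fraction` of cases; the rest are referred."""
    order = np.argsort(uncertainty, kind="stable")  # most certain first
    y_sorted = y_true[order]
    pred_sorted = (p_mean[order] >= 0.5).astype(int)
    results = {}
    for frac in fractions:
        n_keep = max(1, int(round(frac * len(y_true))))
        results[frac] = float(np.mean(pred_sorted[:n_keep] == y_sorted[:n_keep]))
    return results

# Toy labels and mean predicted probabilities (invented for illustration only).
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 0])
p_mean = np.array([0.05, 0.95, 0.90, 0.10, 0.45, 0.55, 0.60, 0.40])

curve = accuracy_vs_retention(
    y_true, p_mean, predictive_entropy(p_mean), fractions=[0.5, 1.0]
)
print(curve)
```

In this toy case the two misclassified examples are also the two most uncertain ones, so accuracy is higher at 50% retention than on the full set. A method with well-behaved uncertainties should show this pattern on a retention curve.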

Editor's Notes

  1. Page 2, Section 1 UCI dataset archive: https://archive.ics.uci.edu/ml/datasets.php?format=&task=&att=&area=&numAtt=&numIns=&type=&sort=nameUp&view=table
  4. You’re not just learning the weights; you’re learning a distribution for each of the weights. You assume a form for the prior and posterior distributions, conventionally a Gaussian (unless you have prior info to say otherwise), to reduce computational complexity.
  5. You can also think of building uncertainties in terms of the test output. This is something akin to “bootstrapping”. But how can we bootstrap neural net inference? It’s become popular in the BDL community to do this by applying dropout at test time, and doing an MC simulation of the test output. This builds a distribution of potential outcomes.
  6. Extremely simple compared to the previous techniques --- just train a lot (hence “ensemble”) of deterministic (traditional) models in parallel, with varying random seeds. This gives a sense of the range of possible training outcomes.
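
The notes on MC Dropout and ensembling can be made concrete with a minimal NumPy sketch. It is not the paper's implementation, and every name in it is hypothetical. Training is omitted: randomly initialised networks with different seeds stand in for independently trained models. The sketch only shows how the three sets of predictive samples are formed: dropout kept active at test time (MC Dropout), one deterministic pass per model (Deep Ensemble), and both combined (Ensemble MC Dropout).

```python
import numpy as np

def init_model(seed, n_in=4, n_hidden=16):
    """Randomly initialised one-hidden-layer net; a stand-in for a trained model."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 1.0, (n_in, n_hidden)),
        "W2": rng.normal(0.0, 1.0, (n_hidden, 1)),
    }

def forward(model, x, rng=None, p_drop=0.5):
    """Forward pass. If rng is given, dropout stays active (as in MC Dropout)."""
    h = np.maximum(x @ model["W1"], 0.0)  # ReLU hidden layer
    if rng is not None:
        mask = rng.random(h.shape) >= p_drop
        h = h * mask / (1.0 - p_drop)  # inverted dropout scaling
    logits = h @ model["W2"]
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))  # P(y=1 | x)

def mc_dropout_samples(model, x, n_samples, seed):
    """Stack n_samples stochastic forward passes with dropout switched on."""
    rng = np.random.default_rng(seed)
    return np.stack([forward(model, x, rng) for _ in range(n_samples)])

x = np.random.default_rng(0).normal(size=(5, 4))   # 5 toy inputs
models = [init_model(seed) for seed in range(3)]   # differ only in random seed

mc = mc_dropout_samples(models[0], x, n_samples=20, seed=100)  # one model, 20 passes
ens = np.stack([forward(m, x) for m in models])                # 3 deterministic models
ens_mc = np.concatenate(
    [mc_dropout_samples(m, x, 20, seed=100 + i) for i, m in enumerate(models)]
)                                                              # 3 models x 20 passes

for name, samples in [("MC Dropout", mc), ("Deep Ensemble", ens),
                      ("Ensemble MC Dropout", ens_mc)]:
    # Mean over samples is the prediction; spread is one simple uncertainty estimate.
    print(name, samples.mean(axis=0).round(3), samples.std(axis=0).round(3))
```

In each case the mean over samples serves as the prediction. The standard deviation is one simple choice of uncertainty estimate, used here as an assumption; others, such as predictive entropy, are also common.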