SlideShare a Scribd company logo
Review By Seong Hoon Jung
hoondori@gmail.com
2020.12
요약
• Reasoning about generalization in deep learning
• Couple the Real World, where optimizers take stochastic
gradient seps on empirical loss, to the Ideal World, where
optimizers take stochastic gradient seps on population loss
• Decomposition of test error into: (1) Ideal world test error
plus (2) the gap between the two worlds
• Evidence that this can be small in realistic deep learning
Generalization Gap
• The goal of a generalization theory in supervised learning is
to understand when and why trained models have small test
error
기존
우리의 해석
Reuse
Fresh
Experimental Validation
• How to construct Ideal world (infinite population data)
• CIFAR-5m
• 6 million synthetic CIFAR-10-like images from GAN
• Labeling samples by a 98.5% accurate classification model
• ImageNet-DogBird
• More complex/real images
• Collapsing classes into superclass => 155K images
• Image-based data-augmentation
Experimental setup
Ex) for CIFAR-5m, n=50K, infinite=5M
for ImageNet, n=10K, infinite=155K
Soft error instead of hard error
Claim: Bootstrap error is not BIG !!
Naive한 해석
1. Data가 그렇게 많을 필요가 없다.
2. Algorithm, Architecture choice가
중요할수도 있다.
Bootstrap error is bounded in deep learning setting
Bootstrap error is bounded in deep learning setting
Sample size effect
Sample size가 부족하면 generalization gap이 커진다.
하지만 학습종료 도달 전까지는 gap이 작다
Effect of Data augmentation
• data augmentation does typically reduce the bootstrap gap
• Good data augmentations should (1) not hurt optimization in the Ideal
World (i.e.,not destroy true samples much), and (2) obstruct
optimization in the Real World (so the Real World can improve for
longer before converging)
Effect of pretrained model
Stopping image 의 차이만 있고, generalization gap에는 영향이 없다.
Implicit Bias v.s Explicit optimization
• Current research of Behnam suggest that
• Convet is better generalized well than fully-connected
• Implicit bias of SGD toword convet in the real-world setting (n=50k)
• Instead of studying implicit bias of optimization on the empirical loss, we could study
explicit properties of optimization on the population loss.
• We show that, in fact, this generalization is captured by the fact that convet optimizes
much faster on the population loss than fully-connected
https://youtu.be/xu6fz0Z5RiU
Behnam Neyshabur. Towards learning convolutions from scratch. arXiv preprint arXiv:2007.13657, 2020.
Model Selection
in the Over- and Under-parameterized Regimes
• The same techniques (architectures and training methods) are
used in practice in both over- and under-parameterized
regimes.
• ResNet-101 is competitive both on 1 billion images of
Instagram (when it is under-parameterized) and on 50k
images of CIFAR-10 (when it is overparameterized
Model Selection
in the Over- and Under-parameterized Regimes
• very different considerations in each regime
• In the overparameterized regime, architecture matters for
generalization reasons: there are many ways to fit the train set,
and some architectures lead SGD to minima that generalize better
• In the underparameterized regime, architecture matters for purely
optimization reasons: all models will have small generalization gap
with 1 billion+ samples, but we seek models which are capable of
reaching low values of test loss, and which do so quickly (with few
optimization steps)
Our unified framework
• Our work suggests that these phenomena are closely related:
If the boostrap error is small, then we should expect that
architectures which optimize well in the infinite-data
(underparameterized) regime also generalize well in the
finite-data (overparameterized) regime.
• This unifies the two apriori different principles guiding
model-selection in over and under-parameterized regimes,
and helps understand why the same architectures are used in
both regimes
Conclusion
• Deep Bootstrap framework for understanding generalization
in deep learning
• Compare two worlds…. Gap is small in deep learning setting.
• Real World, finite, reuse
• Ideal World, infinite, fresh
• First step towards characterizing the bootstrap error
• Need further study…

More Related Content

What's hot

InfoGAIL
InfoGAIL InfoGAIL
InfoGAIL
Sungjoon Choi
 
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
JaeJun Yoo
 
do adversarially robust image net models transfer better
do adversarially robust image net models transfer betterdo adversarially robust image net models transfer better
do adversarially robust image net models transfer better
LEE HOSEONG
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter Tuning
Shubhmay Potdar
 
Image classification with neural networks
Image classification with neural networksImage classification with neural networks
Image classification with neural networks
Sepehr Rasouli
 
Meta Dropout: Learning to Perturb Latent Features for Generalization
Meta Dropout: Learning to Perturb Latent Features for Generalization Meta Dropout: Learning to Perturb Latent Features for Generalization
Meta Dropout: Learning to Perturb Latent Features for Generalization
MLAI2
 
Learning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat MinimaLearning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat Minima
Sangwoo Mo
 
Deep Learning Theory Seminar (Chap 1-2, part 1)
Deep Learning Theory Seminar (Chap 1-2, part 1)Deep Learning Theory Seminar (Chap 1-2, part 1)
Deep Learning Theory Seminar (Chap 1-2, part 1)
Sangwoo Mo
 
Machine Learning Made Simple: Differential evolution
Machine Learning Made Simple: Differential evolutionMachine Learning Made Simple: Differential evolution
Machine Learning Made Simple: Differential evolution
Devansh16
 
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
taeseon ryu
 
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
taeseon ryu
 
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHMJOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHMmailjkb
 

What's hot (12)

InfoGAIL
InfoGAIL InfoGAIL
InfoGAIL
 
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Anal...
 
do adversarially robust image net models transfer better
do adversarially robust image net models transfer betterdo adversarially robust image net models transfer better
do adversarially robust image net models transfer better
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter Tuning
 
Image classification with neural networks
Image classification with neural networksImage classification with neural networks
Image classification with neural networks
 
Meta Dropout: Learning to Perturb Latent Features for Generalization
Meta Dropout: Learning to Perturb Latent Features for Generalization Meta Dropout: Learning to Perturb Latent Features for Generalization
Meta Dropout: Learning to Perturb Latent Features for Generalization
 
Learning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat MinimaLearning Theory 101 ...and Towards Learning the Flat Minima
Learning Theory 101 ...and Towards Learning the Flat Minima
 
Deep Learning Theory Seminar (Chap 1-2, part 1)
Deep Learning Theory Seminar (Chap 1-2, part 1)Deep Learning Theory Seminar (Chap 1-2, part 1)
Deep Learning Theory Seminar (Chap 1-2, part 1)
 
Machine Learning Made Simple: Differential evolution
Machine Learning Made Simple: Differential evolutionMachine Learning Made Simple: Differential evolution
Machine Learning Made Simple: Differential evolution
 
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
 
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
 
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHMJOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
JOB SCHEDULING USING ANT COLONY OPTIMIZATION ALGORITHM
 

Similar to The deep bootstrap 논문 리뷰

PR-411: Model soups: averaging weights of multiple fine-tuned models improves...
PR-411: Model soups: averaging weights of multiple fine-tuned models improves...PR-411: Model soups: averaging weights of multiple fine-tuned models improves...
PR-411: Model soups: averaging weights of multiple fine-tuned models improves...
Sunghoon Joo
 
Sim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement LearningSim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement Learning
atulshah16
 
The deep bootstrap framework review
The deep bootstrap framework reviewThe deep bootstrap framework review
The deep bootstrap framework review
taeseon ryu
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
Jinwon Lee
 
Regularization in deep learning
Regularization in deep learningRegularization in deep learning
Regularization in deep learning
Kien Le
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Alok Singh
 
Building Continuous Learning Systems
Building Continuous Learning SystemsBuilding Continuous Learning Systems
Building Continuous Learning Systems
Anuj Gupta
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
Jon Lederman
 
MACHINE LEARNING YEAR DL SECOND PART.pptx
MACHINE LEARNING YEAR DL SECOND PART.pptxMACHINE LEARNING YEAR DL SECOND PART.pptx
MACHINE LEARNING YEAR DL SECOND PART.pptx
NAGARAJANS68
 
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
Aalto University
 
Information Theoretic aspect of reinforcement learning
Information Theoretic aspect of reinforcement learningInformation Theoretic aspect of reinforcement learning
Information Theoretic aspect of reinforcement learning
JongsuHa
 
Lec16: Medical Image Registration (Advanced): Deformable Registration
Lec16: Medical Image Registration (Advanced): Deformable RegistrationLec16: Medical Image Registration (Advanced): Deformable Registration
Lec16: Medical Image Registration (Advanced): Deformable Registration
Ulaş Bağcı
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
Jinwon Lee
 
in5490-classification (1).pptx
in5490-classification (1).pptxin5490-classification (1).pptx
in5490-classification (1).pptx
MonicaTimber
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and Practices
Yoav Francis
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
Aun Akbar
 
lec1.ppt
lec1.pptlec1.ppt
lec1.ppt
SVasuKrishna1
 
Transfer Learning for Improving Model Predictions in Robotic Systems
Transfer Learning for Improving Model Predictions  in Robotic SystemsTransfer Learning for Improving Model Predictions  in Robotic Systems
Transfer Learning for Improving Model Predictions in Robotic Systems
Pooyan Jamshidi
 
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for D...
 Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for D... Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for D...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for D...
Databricks
 

Similar to The deep bootstrap 논문 리뷰 (20)

PR-411: Model soups: averaging weights of multiple fine-tuned models improves...
PR-411: Model soups: averaging weights of multiple fine-tuned models improves...PR-411: Model soups: averaging weights of multiple fine-tuned models improves...
PR-411: Model soups: averaging weights of multiple fine-tuned models improves...
 
Sim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement LearningSim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement Learning
 
The deep bootstrap framework review
The deep bootstrap framework reviewThe deep bootstrap framework review
The deep bootstrap framework review
 
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
 
Regularization in deep learning
Regularization in deep learningRegularization in deep learning
Regularization in deep learning
 
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
Big Data Spain 2018: How to build Weighted XGBoost ML model for Imbalance dat...
 
Building Continuous Learning Systems
Building Continuous Learning SystemsBuilding Continuous Learning Systems
Building Continuous Learning Systems
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
 
MACHINE LEARNING YEAR DL SECOND PART.pptx
MACHINE LEARNING YEAR DL SECOND PART.pptxMACHINE LEARNING YEAR DL SECOND PART.pptx
MACHINE LEARNING YEAR DL SECOND PART.pptx
 
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
 
Information Theoretic aspect of reinforcement learning
Information Theoretic aspect of reinforcement learningInformation Theoretic aspect of reinforcement learning
Information Theoretic aspect of reinforcement learning
 
Lec16: Medical Image Registration (Advanced): Deformable Registration
Lec16: Medical Image Registration (Advanced): Deformable RegistrationLec16: Medical Image Registration (Advanced): Deformable Registration
Lec16: Medical Image Registration (Advanced): Deformable Registration
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
 
in5490-classification (1).pptx
in5490-classification (1).pptxin5490-classification (1).pptx
in5490-classification (1).pptx
 
PCA.pptx
PCA.pptxPCA.pptx
PCA.pptx
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and Practices
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
lec1.ppt
lec1.pptlec1.ppt
lec1.ppt
 
Transfer Learning for Improving Model Predictions in Robotic Systems
Transfer Learning for Improving Model Predictions  in Robotic SystemsTransfer Learning for Improving Model Predictions  in Robotic Systems
Transfer Learning for Improving Model Predictions in Robotic Systems
 
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for D...
 Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for D... Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for D...
Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for D...
 

Recently uploaded

Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
Abida Shariff
 

Recently uploaded (20)

Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 

The deep bootstrap 논문 리뷰

  • 1. Review By Seong Hoon Jung hoondori@gmail.com 2020.12
  • 2. 요약 • Reasoning about generalization in deep learning • Couple the Real World, where optimizers take stochastic gradient seps on empirical loss, to the Ideal World, where optimizers take stochastic gradient seps on population loss • Decomposition of test error into: (1) Ideal world test error plus (2) the gap between the two worlds • Evidence that this can be small in realistic deep learning
  • 3. Generalization Gap • The goal of a generalization theory in supervised learning is to understand when and why trained models have small test error 기존 우리의 해석 Reuse Fresh
  • 4. Experimental Validation • How to construct Ideal world (infinite population data) • CIFAR-5m • 6 million synthetic CIFAR-10-like images from GAN • Labeling samples by a 98.5% accurate classification model • ImageNet-DogBird • More complex/real images • Collapsing classes into superclass => 155K images • Image-based data-augmentation
  • 5. Experimental setup Ex) for CIFAR-5m, n=50K, infinite=5M for ImageNet, n=10K, infinite=155K Soft error instead of hard error
  • 6. Claim: Bootstrap error is not BIG !! Naive한 해석 1. Data가 그렇게 많을 필요가 없다. 2. Algorithm, Architecture choice가 중요할수도 있다.
  • 7. Bootstrap error is bounded in deep learning setting
  • 8. Bootstrap error is bounded in deep learning setting
  • 9. Sample size effect Sample size가 부족하면 generalization gap이 커진다. 하지만 학습종료 도달 전까지는 gap이 작다
  • 10. Effect of Data augmentation • data augmentation does typically reduce the bootstrap gap • Good data augmentations should (1) not hurt optimization in the Ideal World (i.e.,not destroy true samples much), and (2) obstruct optimization in the Real World (so the Real World can improve for longer before converging)
  • 11. Effect of pretrained model Stopping image 의 차이만 있고, generalization gap에는 영향이 없다.
  • 12. Implicit Bias v.s Explicit optimization • Current research of Behnam suggest that • Convet is better generalized well than fully-connected • Implicit bias of SGD toword convet in the real-world setting (n=50k) • Instead of studying implicit bias of optimization on the empirical loss, we could study explicit properties of optimization on the population loss. • We show that, in fact, this generalization is captured by the fact that convet optimizes much faster on the population loss than fully-connected https://youtu.be/xu6fz0Z5RiU Behnam Neyshabur. Towards learning convolutions from scratch. arXiv preprint arXiv:2007.13657, 2020.
  • 13. Model Selection in the Over- and Under-parameterized Regimes • The same techniques (architectures and training methods) are used in practice in both over- and under-parameterized regimes. • ResNet-101 is competitive both on 1 billion images of Instagram (when it is under-parameterized) and on 50k images of CIFAR-10 (when it is overparameterized
  • 14. Model Selection in the Over- and Under-parameterized Regimes • very different considerations in each regime • In the overparameterized regime, architecture matters for generalization reasons: there are many ways to fit the train set, and some architectures lead SGD to minima that generalize better • In the underparameterized regime, architecture matters for purely optimization reasons: all models will have small generalization gap with 1 billion+ samples, but we seek models which are capable of reaching low values of test loss, and which do so quickly (with few optimization steps)
  • 15. Our unified framework • Our work suggests that these phenomena are closely related: If the boostrap error is small, then we should expect that architectures which optimize well in the infinite-data (underparameterized) regime also generalize well in the finite-data (overparameterized) regime. • This unifies the two apriori different principles guiding model-selection in over and under-parameterized regimes, and helps understand why the same architectures are used in both regimes
  • 16. Conclusion • Deep Bootstrap framework for understanding generalization in deep learning • Compare two worlds…. Gap is small in deep learning setting. • Real World, finite, reuse • Ideal World, infinite, fresh • First step towards characterizing the bootstrap error • Need further study…