Review By Seong Hoon Jung
hoondori@gmail.com
2020.12
Summary
• Reasoning about generalization in deep learning
• Couple the Real World, where optimizers take stochastic
gradient steps on the empirical loss, to the Ideal World, where
optimizers take stochastic gradient steps on the population loss
• Decomposition of test error into: (1) Ideal world test error
plus (2) the gap between the two worlds
• Evidence that this gap can be small in realistic deep learning settings
Generalization Gap
• The goal of a generalization theory in supervised learning is
to understand when and why trained models have small test
error
Conventional view
Our interpretation
Real World: samples are reused
Ideal World: samples are fresh
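The two views on this slide can be written side by side. The notation is an assumption, not taken from the slides: f_t denotes the Real World model after t SGD steps on n reused samples, and f_t^iid denotes the Ideal World model after t steps on fresh samples.

```latex
% Conventional view: split test error into train error plus the train/test gap
\mathrm{TestError}(f_t)
  = \mathrm{TrainError}(f_t)
  + \underbrace{\bigl[\mathrm{TestError}(f_t) - \mathrm{TrainError}(f_t)\bigr]}_{\text{generalization gap}}

% Our interpretation: split test error into Ideal World error plus the gap between worlds
\mathrm{TestError}(f_t)
  = \mathrm{TestError}(f_t^{\mathrm{iid}})
  + \underbrace{\bigl[\mathrm{TestError}(f_t) - \mathrm{TestError}(f_t^{\mathrm{iid}})\bigr]}_{\text{bootstrap error}}
```

Both lines are identities. The claim of the framework is that the bracketed term in the second line stays small in practice.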
Experimental Validation
• How to construct the Ideal World (infinite population data)
• CIFAR-5m
• 6 million synthetic CIFAR-10-like images sampled from a generative model
• Labeling samples by a 98.5% accurate classification model
• ImageNet-DogBird
• More complex/real images
• Collapsing classes into superclass => 155K images
• Image-based data-augmentation
Experimental setup
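Example: for CIFAR-5m, n = 50K in the Real World, and 5M samples stand in for infinite data in the Ideal World;
for ImageNet-DogBird, n = 10K, and 155K samples stand in for infinite data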
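A minimal sketch of the two training regimes, assuming a toy 2-D linear classifier in place of a deep network. The names `sample`, `sgd_step`, and `train_two_worlds` are hypothetical, and the Real World loop cycles through its training set in a fixed order rather than shuffling each epoch. Both worlds share the same model, learning rate, and number of steps; only the data source differs.

```python
import math
import random

def sample(rng):
    """Draw one (x, y) pair from a toy population: y = 1 if x0 + x1 > 0."""
    x = (rng.gauss(0, 1), rng.gauss(0, 1))
    return x, int(x[0] + x[1] > 0)

def predict(w, x):
    """Probability of class 1 under a linear logistic model."""
    return 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1])))

def sgd_step(w, x, y, lr=0.1):
    """One SGD step on the logistic loss for a single example."""
    p = predict(w, x)
    return [w[0] - lr * (p - y) * x[0], w[1] - lr * (p - y) * x[1]]

def soft_error(w, data):
    """Mean probability assigned to the wrong class."""
    errs = [(1.0 - predict(w, x)) if y == 1 else predict(w, x) for x, y in data]
    return sum(errs) / len(errs)

def train_two_worlds(steps, n_train, seed=0):
    rng = random.Random(seed)
    train_set = [sample(rng) for _ in range(n_train)]
    test_set = [sample(rng) for _ in range(2000)]
    w_real, w_ideal = [0.0, 0.0], [0.0, 0.0]
    for t in range(steps):
        # Real World: reuse the same n_train samples over and over.
        x, y = train_set[t % n_train]
        w_real = sgd_step(w_real, x, y)
        # Ideal World: every step sees a fresh sample from the population.
        x, y = sample(rng)
        w_ideal = sgd_step(w_ideal, x, y)
    real = soft_error(w_real, test_set)
    ideal = soft_error(w_ideal, test_set)
    return real, ideal, real - ideal  # last value is the bootstrap error

real, ideal, gap = train_two_worlds(steps=500, n_train=50)
print(f"real={real:.3f} ideal={ideal:.3f} bootstrap_error={gap:.3f}")
```

In the actual experiments the Ideal World is only approximated by a very large dataset (5M or 155K samples), which this toy replaces with an exact sampler.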
Soft error instead of hard error
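A minimal sketch of the difference between the two metrics, assuming soft error means the softmax probability mass placed on wrong classes (the slide does not define it). The function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_error(logits_batch, labels):
    """Mean probability assigned to any class other than the true one."""
    errs = [1.0 - softmax(logits)[y] for logits, y in zip(logits_batch, labels)]
    return sum(errs) / len(errs)

def hard_error(logits_batch, labels):
    """Fraction of examples whose argmax prediction is wrong."""
    wrong = [
        int(max(range(len(logits)), key=logits.__getitem__) != y)
        for logits, y in zip(logits_batch, labels)
    ]
    return sum(wrong) / len(wrong)

batch = [[2.0, 1.0], [0.0, 3.0]]
labels = [0, 0]
print(hard_error(batch, labels), soft_error(batch, labels))
```

Soft error changes continuously with the logits, so it can distinguish two models that make the same argmax predictions with different confidence.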
Claim: the bootstrap error is not large
Naive interpretation
1. You do not need that much data.
2. The choice of algorithm and architecture may matter.
Bootstrap error is bounded in deep learning setting
Sample size effect
When the sample size is too small, the generalization gap grows.
However, the gap stays small until training reaches its end.
Effect of Data augmentation
• Data augmentation typically reduces the bootstrap gap
• Good data augmentations should (1) not hurt optimization in the Ideal
World (i.e., not destroy true samples much), and (2) obstruct
optimization in the Real World (so the Real World can improve for
longer before converging)
Effect of pretrained model
Only the stopping time differs; there is no effect on the generalization gap.
Implicit Bias vs. Explicit Optimization
• Recent work by Behnam Neyshabur suggests that
• ConvNets generalize better than fully-connected networks
• Implicit bias of SGD toward ConvNets in the Real World setting (n=50K)
• Instead of studying implicit bias of optimization on the empirical loss, we could study
explicit properties of optimization on the population loss.
• We show that, in fact, this generalization is captured by the fact that ConvNets optimize
much faster on the population loss than fully-connected networks
https://youtu.be/xu6fz0Z5RiU
Behnam Neyshabur. Towards learning convolutions from scratch. arXiv preprint arXiv:2007.13657, 2020.
Model Selection
in the Over- and Under-parameterized Regimes
• The same techniques (architectures and training methods) are
used in practice in both over- and under-parameterized
regimes.
• ResNet-101 is competitive both on 1 billion images of
Instagram (when it is under-parameterized) and on 50K
images of CIFAR-10 (when it is over-parameterized)
Model Selection
in the Over- and Under-parameterized Regimes
• Very different considerations apply in each regime
• In the overparameterized regime, architecture matters for
generalization reasons: there are many ways to fit the train set,
and some architectures lead SGD to minima that generalize better
• In the underparameterized regime, architecture matters for purely
optimization reasons: all models will have small generalization gap
with 1 billion+ samples, but we seek models which are capable of
reaching low values of test loss, and which do so quickly (with few
optimization steps)
Our unified framework
• Our work suggests that these phenomena are closely related:
If the bootstrap error is small, then we should expect that
architectures which optimize well in the infinite-data
(underparameterized) regime also generalize well in the
finite-data (overparameterized) regime.
• This unifies the two a priori different principles guiding
model selection in over- and under-parameterized regimes,
and helps understand why the same architectures are used in
both regimes
Conclusion
• Deep Bootstrap framework for understanding generalization
in deep learning
• Compare two worlds: the gap between them is small in the deep learning setting.
• Real World, finite, reuse
• Ideal World, infinite, fresh
• First step towards characterizing the bootstrap error
• Further study is needed.

The Deep Bootstrap: paper review
