Learning with Purpose
DEEP RESIDUAL NETWORKS
Kaiming He et al., “Deep Residual Learning for Image Recognition”
Kaiming He et al., “Identity Mappings in Deep Residual Networks”
Andreas Veit et al., “Residual Networks Behave Like Ensembles of Relatively Shallow Networks”
ResNet @ILSVRC & COCO 2015 Competitions
1st places in all five main tracks
• ImageNet Classification: “Ultra-deep” 152-layer nets
• ImageNet Detection: 16% better than 2nd
• ImageNet Localization: 27% better than 2nd
• COCO Detection: 11% better than 2nd
• COCO Segmentation: 12% better than 2nd
Evolution of Deep Networks
ImageNet Classification Challenge Error rates by year
ImageNet competition results show that the winning solutions have
become deeper and deeper: from 8 layers in 2012 to 200+ layers in
2016.
What Does Depth Mean?
Deep representation ability
Forward (data flow)
What Does Depth Mean?
Is learning better networks as easy as stacking more layers?
Backward (gradient flow)
Gradient Vanishing
• Backpropagation multiplies per-layer gradient factors together, so the gradient can shrink exponentially with depth
• This can be addressed by:
– Normalized initialization
– Batch normalization
– An appropriate activation function
• e.g. replacing Sigmoid(x) with ReLU(x)
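The multiplicative effect can be illustrated with a small sketch. It assumes a single path through the network in which every layer contributes a factor weight * act'(z); the depth, the pre-activation values, and the helper names are hypothetical and chosen only for illustration.

```python
import math

def sigmoid_grad(z):
    """Derivative of the logistic sigmoid at pre-activation z."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU at pre-activation z (taken as 0 at z == 0)."""
    return 1.0 if z > 0 else 0.0

def chained_gradient(act_grad, pre_activations, weight=1.0):
    """Product of per-layer factors weight * act'(z) along one path."""
    g = 1.0
    for z in pre_activations:
        g *= weight * act_grad(z)
    return g

depth = 20
zs = [0.5] * depth  # hypothetical pre-activations, identical at every layer
sigmoid_chain = chained_gradient(sigmoid_grad, zs)
relu_chain = chained_gradient(relu_grad, zs)
print(f"sigmoid: {sigmoid_chain:.3e}")
print(f"relu:    {relu_chain:.3e}")
```

Because the sigmoid derivative never exceeds 0.25, the product collapses quickly with depth, while an active ReLU contributes a factor of exactly 1.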
Simply Stacking Layers?
• Plain networks on CIFAR-10
• Plain nets: stacking 3×3 conv layers…
• A 56-layer net has higher training error and test error than a 20-layer net
Performance Saturation/Degradation
• Overly deep plain nets have higher training error
• A general phenomenon, observed in many datasets.
Figure: a shallower model (18 layers) and a deeper counterpart (34 layers)
• Richer solution space
• A deeper model should not have higher training error
• A solution by construction:
– Original layers: copied from a trained shallower model
– Extra layers: set as identity
– At least the same training error
• Optimization difficulties: solvers cannot find the solution when going deeper…
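The construction argument can be sketched in a few lines. This is a toy example under stated assumptions: the "trained" weights are made-up numbers, each layer is a linear map followed by ReLU, and the helper names are hypothetical. The appended identity layers leave the output unchanged because the features entering them are already non-negative.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def linear(weights, v):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]

def identity_matrix(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def forward(layers, v):
    """Plain network: each layer is a linear map followed by ReLU."""
    for w in layers:
        v = relu(linear(w, v))
    return v

# Hypothetical "trained" shallow network with two layers.
shallow = [
    [[0.5, -1.0], [1.5, 0.25]],
    [[1.0, 2.0], [-0.5, 0.75]],
]
# Deeper counterpart: copy the shallow layers, then append identity layers.
deeper = shallow + [identity_matrix(2) for _ in range(3)]

x = [1.0, 2.0]
out_shallow = forward(shallow, x)
out_deeper = forward(deeper, x)
print(out_shallow, out_deeper)
```

The deeper model reproduces the shallow model's output exactly, so its training error can be no worse. The difficulty noted on the slide is that a solver does not find this solution on its own.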
Network Design
• Keep it simple
• Based on the VGG philosophy
– (Almost) all 3×3 conv
– Spatial size /2 => # filters ×2
– Simple design; just deep!
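The "spatial size /2, filters ×2" rule can be sketched as follows. The starting values (56×56 feature maps, 64 filters) and the function names are illustrative assumptions; the cost function counts multiply-adds of one 3×3 convolution with equal input and output channels.

```python
def stage_plan(spatial, filters, num_stages):
    """Halve the spatial size and double the filter count at each new stage."""
    plan = []
    for _ in range(num_stages):
        plan.append((spatial, filters))
        spatial //= 2
        filters *= 2
    return plan

def conv3x3_cost(spatial, filters):
    """Multiply-adds of one 3x3 conv with equal input/output channels."""
    return spatial * spatial * filters * filters * 9

# Illustrative starting point: 56x56 feature maps with 64 filters.
plan = stage_plan(spatial=56, filters=64, num_stages=4)
costs = [conv3x3_cost(s, f) for s, f in plan]
print(plan)
print(costs)
```

Under this rule the per-layer cost stays the same in every stage, which is one reason the design can simply be made deeper.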
ResNet Can Be Deeper
Residual Learning Block
• Define H(x) = F(x) + x; the stacked weight layers try to approximate F(x) instead of H(x).
If the optimal function is closer to an identity mapping, it should be easier for the solver to find the perturbations with reference to an identity mapping than to learn the function as a new one.
• The identity shortcut introduces neither extra parameters nor extra computational complexity
• Element-wise addition is performed on all feature maps
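A minimal sketch of a residual block, assuming F(x) is two linear layers with a ReLU in between. Batch normalization and the ReLU after the addition are omitted for brevity, and the weights and helper names are hypothetical.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def linear(weights, v):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """y = F(x) + x, where F(x) = W2 * relu(W1 * x)."""
    fx = linear(w2, relu(linear(w1, x)))
    return [f + a for f, a in zip(fx, x)]  # element-wise shortcut addition

x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
small = [[0.1 if i == j else 0.0 for j in range(3)] for i in range(3)]

identity_out = residual_block(x, zeros, zeros)   # F(x) = 0, so y = x
perturbed_out = residual_block(x, small, small)  # small perturbation of identity
print(identity_out, perturbed_out)
```

With all-zero weights the block is exactly the identity, so the solver only has to learn a perturbation around it. The shortcut itself adds no parameters.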
The Insight of Identity Mapping
• We turn the ReLU activation function after the addition into an identity mapping
If f is also an identity mapping: x_{l+1} ≡ y_l
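The equations on this slide did not survive extraction. A reconstruction, assuming the notation of the cited "Identity Mappings in Deep Residual Networks" paper (h is the shortcut mapping, f the function applied after the addition, and F the residual function with weights W_l), would read:

```latex
\begin{aligned}
y_l &= h(x_l) + \mathcal{F}(x_l, \mathcal{W}_l), \\
x_{l+1} &= f(y_l). \\[4pt]
\text{With } h(x_l) = x_l \text{ and } f(y_l) = y_l:\quad
x_{l+1} &= x_l + \mathcal{F}(x_l, \mathcal{W}_l).
\end{aligned}
```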
Smooth Forward Propagation
• Any x_l is directly forward-propagated to any deeper x_L, plus a sum of residuals.
• Any x_L is an additive outcome of x_l and the residuals
• In contrast to the multiplicative form of a plain network (ignoring BN and ReLU)
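The two formulas being contrasted were lost in extraction. Assuming identity shortcuts and the notation of the cited identity-mappings paper, applying the block recursion repeatedly gives the additive form, while a plain network (ignoring BN and ReLU) gives a product of weight matrices:

```latex
\begin{aligned}
\text{Residual network:}\quad x_L &= x_l + \sum_{i=l}^{L-1} \mathcal{F}(x_i, \mathcal{W}_i) \\
\text{Plain network:}\quad x_L &= \prod_{i=0}^{L-1} W_i \, x_0
\end{aligned}
```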
Smooth Backward Propagation
• The gradient flow is also additive.
• The gradient of any layer is unlikely to vanish
• In contrast to the multiplicative form of a plain network
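The gradient expression on this slide was lost in extraction. A reconstruction, assuming a loss E and the additive forward form x_L = x_l + Σ F(x_i, W_i) from the cited identity-mappings paper, follows from the chain rule:

```latex
\frac{\partial \mathcal{E}}{\partial x_l}
  = \frac{\partial \mathcal{E}}{\partial x_L}\,\frac{\partial x_L}{\partial x_l}
  = \frac{\partial \mathcal{E}}{\partial x_L}
    \left( 1 + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} \mathcal{F}(x_i, \mathcal{W}_i) \right)
```

The constant 1 carries the gradient from x_L straight back to x_l. The gradient vanishes only if the second term equals −1, which is unlikely to hold for every sample in a mini-batch.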
What If the Shortcut Mapping h(x) ≠ Identity?
If Scaling the Shortcut
For an extremely deep network (L is large), the shortcut signal is scaled by the product of the per-block factors λ_i. If λ_i > 1 for all i, this factor can be exponentially large; if λ_i < 1 for all i, this factor can be exponentially small and vanish.
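A quick numeric sketch of the scaling factor, assuming a shortcut of the form h(x_l) = λ_l · x_l so that the shortcut path is multiplied by the product of all λ_i. The depth and λ values below are hypothetical.

```python
import math

def shortcut_factor(lambdas):
    """Product of the per-block shortcut scales along the shortcut path."""
    return math.prod(lambdas)

depth = 100  # hypothetical number of residual blocks
for lam in (1.1, 1.0, 0.9):
    factor = shortcut_factor([lam] * depth)
    print(f"lambda = {lam}: factor = {factor:.3e}")

large = shortcut_factor([1.1] * depth)
unit = shortcut_factor([1.0] * depth)
small = shortcut_factor([0.9] * depth)
```

Only λ = 1 (the identity shortcut) keeps the factor at exactly 1 regardless of depth. Even a 10% deviation explodes or vanishes over 100 blocks.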
If Gating the Shortcut
• Gating should increase the representation ability (the number of parameters increases)
• Yet it is optimization, rather than representation ability, that dominates the results
Results of Using Different Types of Shortcut
Identity shortcut is the best
Training curves on CIFAR-10 of various shortcuts
Solid lines denote test error (y-axis on the right), and dashed lines denote training loss (y-axis on the left)
On the Usage of Activation Functions
Proposed
Results of Experiments on Activation
ReLU vs. ReLU+BN
• BN could block propagation
• Keep the shortest path as smooth as possible
ReLU vs. Identity
• ReLU could block propagation when the network is deep
• Pre-activation eases the difficulty in optimization
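The blocking effect can be shown with a scalar toy model. This is a simplified sketch, not the full pre-activation unit (which also includes BN and weight layers): the residual branch is set to zero so that only the shortcut path is observed, and all function names are hypothetical.

```python
def relu(a):
    return max(0.0, a)

def original_unit(x, residual):
    """Original design: ReLU is applied after the addition."""
    return relu(x + residual(x))

def preact_unit(x, residual):
    """Pre-activation design: ReLU moves inside the residual branch."""
    return x + residual(relu(x))

def zero_residual(a):
    """Residual branch that contributes nothing, to isolate the shortcut path."""
    return 0.0

def run(unit, x, depth):
    for _ in range(depth):
        x = unit(x, zero_residual)
    return x

signal = -1.5  # a negative feature value
after_original = run(original_unit, signal, depth=10)
after_preact = run(preact_unit, signal, depth=10)
print(after_original, after_preact)
```

With ReLU after the addition, the negative signal is clipped to zero in the first unit and never recovers. With pre-activation, the shortcut passes it through unchanged.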
ImageNet Results
Conclusions From He et al.
Keep the shortest path as smooth (clean) as possible
By making h(x) and f(x) identity mappings
Forward and backward signals flow directly along this path
Features of any layer are an additive outcome
A 1000-layer ResNet can be trained easily and achieves better accuracy
A Novel Interpretation of Residual Networks
Further expansion of the residual network
Following the previous analysis, we replace x_l with y_l and F with f_l, so that y_{l+1} = y_l + f_l(y_l).
We further expand this expression by unrolling the recursion in terms of the basic input y_0.
Example of Unrolling
We take L = 3 and l = 0 as an example of unrolling.
Data flows from input to output along a number of paths that grows exponentially with depth.
We infer that a residual network with n modules has 2^n paths.
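The unrolled expression itself was an image on the slide. A reconstruction, assuming the recursion y_{l+1} = y_l + f_l(y_l) with modules f_0, f_1, f_2 (this indexing is an assumption and may differ from the cited paper), is:

```latex
\begin{aligned}
y_3 &= y_2 + f_2(y_2) \\
    &= \bigl[y_1 + f_1(y_1)\bigr] + f_2\bigl(y_1 + f_1(y_1)\bigr) \\
    &= \bigl[y_0 + f_0(y_0) + f_1\bigl(y_0 + f_0(y_0)\bigr)\bigr]
       + f_2\bigl(y_0 + f_0(y_0) + f_1\bigl(y_0 + f_0(y_0)\bigr)\bigr)
\end{aligned}
```

Each module's input is either passed through or skipped at every earlier stage, which is where the 2^3 = 8 paths for three modules come from.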
Different From a Traditional Neural Network
In a traditional NN, each layer depends only on the previous layer.
In a ResNet, data flows along many paths from input to output. Each path is a unique configuration of which residual modules to enter and which to skip.
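The enter-or-skip view can be checked numerically in a toy setting. The sketch assumes scalar linear modules f_i(y) = w_i · y with made-up weights; for nonlinear modules the paths interact, so the exact sum-over-paths identity below is a property of this simplification only.

```python
from itertools import product
import math

weights = [0.5, -0.25, 2.0]  # hypothetical scalar modules f_i(y) = w_i * y
y0 = 3.0

# Run the residual network directly: y_{i+1} = y_i + f_i(y_i).
y = y0
for w in weights:
    y = y + w * y

# Enumerate every path: for each module choose enter (True) or skip (False).
paths = list(product([False, True], repeat=len(weights)))
path_sum = 0.0
for choice in paths:
    # A skipped module contributes a factor 1, an entered module a factor w_i.
    path_sum += y0 * math.prod(w if enter else 1.0 for w, enter in zip(weights, choice))

print(len(paths), y, path_sum)
```

Three modules give 2^3 = 8 paths, and in this linear case their contributions add up to the network output.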
Deleting Individual Modules in ResNet
Deleting a layer in residual networks at test time (a) is equivalent to zeroing half of the paths.
In ordinary feed-forward networks (b) such as VGG or AlexNet, deleting individual layers alters the only viable path from input to output.
Deleting Many Modules in ResNet
One key characteristic of ensembles is their smooth performance with respect to the number of members.
When k residual modules are removed, the effective number of paths is reduced from 2^n to 2^(n-k).
Error increases smoothly when randomly deleting several modules from a residual network.
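The path count can be verified by brute force. The sketch below assumes a path is an enter/skip choice for each of n modules and that deleting a module invalidates every path that enters it. The value of n and the function name are hypothetical.

```python
from itertools import product

def surviving_paths(n, deleted):
    """Paths (enter/skip choices over n modules) that avoid every deleted module."""
    paths = product([False, True], repeat=n)
    return [p for p in paths if not any(p[i] for i in deleted)]

n = 6  # hypothetical number of residual modules
for k in range(4):
    deleted = set(range(k))  # delete the first k modules
    count = len(surviving_paths(n, deleted))
    print(f"k = {k}: {count} paths (2^(n-k) = {2 ** (n - k)})")

counts = [len(surviving_paths(n, set(range(k)))) for k in range(4)]
```

Each deleted module halves the number of valid paths, which matches the reduction from 2^n to 2^(n-k).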
Reordering Modules in ResNet
Error also increases smoothly when re-ordering a residual network by shuffling building blocks. The degree of reordering is measured by the Kendall Tau correlation coefficient.
Conclusion
First, the unraveled view reveals that residual networks can be viewed as a collection of many paths, instead of a single ultra-deep network.
Second, lesion studies show that, although these paths are trained jointly, they do not strongly depend on each other.
Thank you

Resnet.pptx

  • 1. Learning with Purpose DEEP RESIDUAL NETWORKS Kaiming He et al, “Deep Residual Learning for Image Recognition” Kaiming He et al, “Identity Mappings in Deep Residual Networks” Andreas Veit et al, “Residual Networks Behave Like Ensembles of Relatively Shallow Networks ”
  • 2. Learning with Purpose ResNet @ILSVRC & COCO 2015 Competitions 1st places in all five main tracks • ImageNet Classification: “Ultra-deep” 152-layer nets • ImageNet Detection: 16% better than 2nd • ImageNet Localization: 27% better than 2nd • COCO Detection: 11% better than 2nd • COCO Segmentation: 12% better than 2nd
  • 3. Learning with Purpose Evolution of Deep Networks ImageNet Classification Challenge Error rates by year ImageNet competition results show that the winning solutions have become deeper and deeper: from 8 layers in 2012 to 200+ layers in 2016.
  • 4. Learning with Purpose What Does Depth Mean? Deep Representation ability Forward(Data flow)
  • 6. Learning with Purpose What Does Depth Mean? Is learning better networks as easy as stacking more layers? Backward(Gradient flow)
  • 7. Learning with Purpose • The multiplying property of gradients causes the phenomenon • This can be addressed by: – Normalized Initialization – Batch Normalization – Appropriate activation function • Sigmoid(x) ReLu(x) Gradient Vanishing
  • 8. Learning with Purpose • Plain networks on Cifar-10 Simply Stacking Layers? • Plain nets: stacking 3*3 conv layers… • 56-layer net has higher training error and test error than 20-layer net
  • 9. Learning with Purpose Performance Saturation/Degradation • Overly deep plain nets have higher training error • A general phenomenon, observed in many datasets.
  • 10. Learning with Purpose a shallower model (18 layers) a deeper counterpart (34 layers) • Richer solution space • A deeper model should not have higher training error • A solution by construction: • Original layers: copied from a trained shallower model • Extra layers: set as identity • At least the same training error • Optimization difficulties: solvers cannot find the solution when going deeper…
  • 11. Network Design • Keep it simple • Based on the VGG philosophy – All 3×3 conv (almost) – Spatial size /2 => # filters ×2 – Simple design; just deep!
  • 13. Residual Learning Block • Define H(x) = F(x) + x; the stacked weight layers try to approximate F(x) instead of H(x). • If the optimal function is closer to an identity mapping, it should be easier for the solver to find the perturbations with reference to an identity mapping than to learn the function as a new one • Introduces neither extra parameters nor computational complexity • Element-wise addition is performed on all feature maps
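A minimal sketch of the block on slide 13, assuming feature maps are flattened into plain Python lists and that F(x) has the same shape as x. The two residual functions are hypothetical placeholders for the stacked weight layers, not the layers used in the papers.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, residual_fn: Callable[[Vector], Vector]) -> Vector:
    """Return H(x) = F(x) + x, adding the identity shortcut element-wise."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("F(x) must have the same shape as x for an identity shortcut")
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x: Vector) -> Vector:
    """Placeholder F that outputs zeros, so the block reduces to the identity."""
    return [0.0 for _ in x]


def small_perturbation(x: Vector) -> Vector:
    """Placeholder F that adds a small perturbation around the identity."""
    return [0.1 * v for v in x]


if __name__ == "__main__":
    print(residual_block([1.0, -2.0], zero_residual))
    print(residual_block([1.0, 2.0], small_perturbation))
```

When the optimal mapping is close to the identity, F only has to represent the small perturbation; driving F to zero recovers the identity exactly, and the shortcut adds no parameters.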
  • 14. The Insight of Identity Mapping • We turn the ReLU activation function after the addition into an identity mapping • If f is also an identity mapping: x_{l+1} ≡ y_l
  • 15. Smooth Forward Propagation • Any x_l is directly forward-propagated to any x_L, plus a residual. • Any x_l is an additive outcome • In contrast to the multiplicative form of a plain network (ignoring BN and ReLU)
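The equation images for slide 15 did not survive in this transcript. As a sketch, assuming the notation of the “Identity Mappings in Deep Residual Networks” paper cited on slide 1 (x_l is the input to the l-th residual unit, F is the residual function with weights W_l, and both the shortcut and the after-addition function are identity), the recursion unrolls as follows, with the plain-network counterpart shown for contrast:

```latex
% Residual network: one unit, then unrolled from unit l to unit L
x_{l+1} = x_l + F(x_l, W_l)
\quad\Longrightarrow\quad
x_L = x_l + \sum_{i=l}^{L-1} F(x_i, W_i)

% Plain network (ignoring BN and ReLU): a product of weight matrices
x_L = \prod_{i=l}^{L-1} W_i \, x_l
```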
  • 16. Smooth Backward Propagation • The gradient flow is also in the form of addition. • The gradient of any layer is unlikely to vanish • In contrast to the multiplicative form
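The backward equation for slide 16 is also missing from the transcript. Assuming the same notation as the “Identity Mappings” paper (E denotes the loss, and x_L = x_l + Σ F(x_i, W_i) for i from l to L−1), the chain rule gives:

```latex
\frac{\partial E}{\partial x_l}
  = \frac{\partial E}{\partial x_L}\,\frac{\partial x_L}{\partial x_l}
  = \frac{\partial E}{\partial x_L}
    \left( 1 + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F(x_i, W_i) \right)
```

The constant 1 carries the gradient at x_L straight back to x_l without passing through any weight layer, so the total gradient vanishes only if the second term happens to equal −1 across a whole mini-batch, which is unlikely.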
  • 17. What if Shortcut Mapping h(x) ≠ Identity?
  • 18. If Scaling the Shortcut • For an extremely deep network (L is large), if λ_i > 1 for all i, this factor can be exponentially large; if λ_i < 1 for all i, this factor can be exponentially small and vanish
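The factor discussed on slide 18 can be checked numerically. This sketch assumes the shortcut is scaled as h(x_l) = λ_l · x_l, as in the “Identity Mappings” paper, so the signal carried by the shortcut path from unit l to unit L is multiplied by the product of the λ_i. The function name and the sample values are illustrative.

```python
import math
from typing import Sequence


def shortcut_factor(lambdas: Sequence[float]) -> float:
    """Product of the per-unit shortcut scales along the shortcut path."""
    return math.prod(lambdas)


if __name__ == "__main__":
    depth = 100  # illustrative number of residual units
    for lam in (0.9, 1.0, 1.1):
        print(lam, shortcut_factor([lam] * depth))
```

Only λ = 1 (the identity shortcut) keeps the factor at 1 regardless of depth; small deviations in either direction compound geometrically.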
  • 19. If Gating the Shortcut • Gating should increase the representation ability (the number of parameters increases) • It is the optimization, rather than the representation, that dominates the results
  • 20. Results of Using Different Types of Shortcut • Identity shortcut is the best
  • 21. Training curves on CIFAR-10 of various shortcuts • Solid lines denote test error (y-axis on the right), and dashed lines denote training loss (y-axis on the left)
  • 22. On the Usage of Activation Functions Proposed
  • 23. Results of Experiments on Activation
  • 24. ReLU vs. ReLU+BN • BN could block propagation • Keep the shortest path as smooth as possible
  • 25. ReLU vs. Identity • ReLU could block propagation when the network is deep • Pre-activation eases the difficulty in optimization
  • 27. Conclusion From He • Keep the shortest path as smooth (clean) as possible • By making h(x) and f(x) identity mappings • Forward and backward signals flow directly through this path • Features of any layer are an additive outcome • A 1000-layer ResNet can be easily trained and has better accuracy
  • 28. A novel interpretation of residual networks • Further expansion of the residual network (diagram labels: y_l, f_l(), y_{l+1}) • Following the previous analysis, we replace x_l with y_l and F with f_l • We further expand this expression by unrolling the recursion in terms of the basic input y.
  • 29. Example of unrolling • We take L = 3 and l = 0 as an example of unrolling • The number of paths along which data flows from input to output grows exponentially • We infer that a residual network with n modules has 2^n paths
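The 2^n count on slide 29 can be verified by enumeration. As a sketch, each path is encoded as n enter/skip choices, and the recursion y_i = y_{i-1} + f_i(y_{i-1}) is checked against the sum over paths. The check uses hypothetical linear scalar modules f_i(y) = w_i · y, an assumption made so that each path's contribution is simply a product; with nonlinear modules the unrolled terms are nested, but the path count is the same.

```python
import math
from itertools import product
from typing import List, Sequence, Tuple


def enumerate_paths(n: int) -> List[Tuple[bool, ...]]:
    """Each path is a tuple of n flags: True = enter module i, False = skip it."""
    return list(product((False, True), repeat=n))


def forward_recursive(y0: float, weights: Sequence[float]) -> float:
    """y_i = y_{i-1} + f_i(y_{i-1}) with toy linear modules f_i(y) = w_i * y."""
    y = y0
    for w in weights:
        y = y + w * y
    return y


def forward_unrolled(y0: float, weights: Sequence[float]) -> float:
    """Sum over all paths; a path multiplies the weights of the modules it enters."""
    total = 0.0
    for path in enumerate_paths(len(weights)):
        # The all-skip path has an empty product, which math.prod evaluates to 1.
        total += y0 * math.prod(w for w, entered in zip(weights, path) if entered)
    return total


if __name__ == "__main__":
    weights = [0.5, -0.3, 0.2]  # L = 3 modules, as in the slide's example
    print(len(enumerate_paths(len(weights))))
    print(forward_recursive(2.0, weights), forward_unrolled(2.0, weights))
```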
  • 30. Different from a traditional neural network • In a traditional NN, each layer only depends on the previous layer • In ResNet, data flows along many paths from input to output. Each path is a unique configuration of which residual modules to enter and which to skip
  • 31. Deleting an individual module in ResNet • Deleting a layer in residual networks at test time (a) is equivalent to zeroing half of the paths. • In ordinary feed-forward networks (b) such as VGG or AlexNet, deleting individual layers alters the only viable path from input to output.
  • 32. Deleting an individual module in ResNet
  • 33. Deleting many modules in ResNet • One key characteristic of ensembles is that their performance varies smoothly with the number of members. • When k residual modules are removed, the effective number of paths is reduced from 2^n to 2^(n-k) • Error increases smoothly when randomly deleting several modules from a residual network
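The path counts on slide 33 follow directly from the enter/skip encoding of a path. In this sketch, deleting a module means that no surviving path may enter it; the function name and the module indices are illustrative.

```python
from itertools import product
from typing import Iterable, List, Tuple


def surviving_paths(n: int, deleted: Iterable[int]) -> List[Tuple[bool, ...]]:
    """Paths through n residual modules that never enter a deleted module."""
    deleted_set = set(deleted)
    return [
        path
        for path in product((False, True), repeat=n)  # True = enter, False = skip
        if not any(path[i] for i in deleted_set)
    ]


if __name__ == "__main__":
    n = 5
    for deleted in ([], [2], [0, 3]):
        # Each deleted module halves the count: 2**n -> 2**(n - k)
        print(deleted, len(surviving_paths(n, deleted)), 2 ** (n - len(deleted)))
```

Each deletion removes exactly the half of the remaining paths that entered that module, which is why the path count falls geometrically while every other path is left intact.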
  • 34. Reordering modules in ResNet • Error also increases smoothly when re-ordering a residual network by shuffling building blocks. • The degree of reordering is measured by the Kendall Tau correlation coefficient.
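The reordering measure on slide 34 can be sketched as follows, assuming the blocks are distinct so there are no ties. This is a plain O(n²) illustration of the Kendall tau coefficient, not the evaluation code used in the paper; the block indices and the shuffle seed are arbitrary.

```python
import random
from itertools import combinations
from typing import Hashable, Sequence


def kendall_tau(order_a: Sequence[Hashable], order_b: Sequence[Hashable]) -> float:
    """Kendall tau between two orderings of the same distinct items (no ties)."""
    n = len(order_a)
    if n != len(order_b) or n < 2:
        raise ValueError("both orderings must contain the same n >= 2 items")
    position_in_b = {item: rank for rank, item in enumerate(order_b)}
    concordant = discordant = 0
    # combinations() yields pairs (x, y) where x precedes y in order_a.
    for x, y in combinations(order_a, 2):
        if position_in_b[x] < position_in_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    blocks = list(range(10))            # original order of building blocks
    shuffled = blocks[:]
    random.Random(0).shuffle(shuffled)  # fixed seed, only for reproducibility
    print(kendall_tau(blocks, shuffled))
```

A value of 1 means the original order is preserved and −1 means it is fully reversed, so lower values correspond to heavier reordering.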
  • 35. Conclusion • First, the unraveled view reveals that residual networks can be viewed as a collection of many paths, instead of a single ultra-deep network • Second, lesion studies show that, although these paths are trained jointly, they do not strongly depend on each other.

Editor's Notes

  1. Hi, today I am going to introduce deep residual networks. This presentation is about 3 papers. The first two are from Kaiming He and his team, and the third one is a novel interpretation of residual networks. I know all of you are familiar with ResNet, so if there is anything I have not understood correctly, or anything you think I should know, please don’t hesitate to tell me.
  2. Although you may already know about ResNet's contributions and competition results, I still want to share this with you. ResNet won a lot of competitions. It won 1st place in all five main tracks: ImageNet classification, detection, and localization, and COCO detection and segmentation.
  3. From the picture of the evolution of deep networks, we can see that the winning solutions have become deeper and deeper, from 8 layers in 2012 to 200+ layers in 2016. ResNet brought a big improvement in performance.
  4. It is noteworthy that the gating and 1×1 convolutional shortcuts introduce more parameters, and should have stronger representational abilities than identity shortcuts. However, their training error is higher than that of identity shortcuts, indicating that the degradation of these models is caused by optimization issues, instead of representational abilities.