Min-Seo Kim
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: kms39273@naver.com
1
 Ongoing studies
• GoogLeNet
2
GoogLeNet
• The GoogLeNet submission to ILSVRC 2014 used 12× fewer parameters than the winning architecture of
Krizhevsky et al. (AlexNet) from two years prior, yet it was significantly more accurate.
• A notable factor is that with the ongoing traction of mobile and embedded computing, the efficiency of
algorithms – especially their power and memory use – gains importance.
Introduction
3
GoogLeNet
• Conventional CNNs have typically had a standard structure – stacked convolutional layers (optionally followed
by contrast normalization and max-pooling) followed by one or more fully-connected layers.
• Network-in-Network uses 1×1 convolutional layers followed by the ReLU activation function, an idea GoogLeNet
adopts heavily.
Related Work
4
GoogLeNet
• The most straightforward way of improving the performance of deep neural networks is by increasing their
size.
• However, this simple solution comes with two major drawbacks.
• A bigger size typically means a larger number of parameters, which makes the enlarged network more
prone to overfitting; creating the high-quality training sets needed to prevent this can be tricky and
expensive.
• Another drawback of uniformly increased network size is the dramatically increased use of computational
resources.
• Since in practice the computational budget is always finite, an efficient distribution of computing resources is
preferred to an indiscriminate increase of size.
Motivation and High Level Considerations
5
GoogLeNet - Architectural Details
• To extract features at multiple scales, 1×1, 3×3, and 5×5 convolution filters are applied in parallel and their
outputs are concatenated.
• However, this inevitably increases the computational load.
Inception module
6
GoogLeNet - Architectural Details
• Therefore, to address this issue, 1×1 convolution filters are used.
• Placed before the 3×3 and 5×5 filters, they reduce the channel dimension, which in turn reduces the
computational load and introduces additional non-linearity.
Inception module
7
GoogLeNet - Architectural Details
How does the 1×1 conv filter reduce the amount of computation?

Direct 5×5 convolution:
- input tensor = 28×28×192
- convolution filter = 5×5×192
- padding = 2
- stride = 1
- number of filters = 32
28×28×192×5×5×32 ≈ 120 million operations

With a 1×1 bottleneck, step 1 (1×1 convolution):
- input tensor = 28×28×192
- convolution filter = 1×1×192
- number of filters = 16
192×1×1×28×28×16 ≈ 2.4 million operations

Step 2 (5×5 convolution on the reduced tensor):
- input tensor = 28×28×16
- convolution filter = 5×5×16
- padding = 2
- stride = 1
- number of filters = 32
16×5×5×28×28×32 ≈ 10 million operations

Total of about 12.4 million operations: the number of operations has decreased roughly tenfold, and the
non-linearity has increased (an extra ReLU after the 1×1 convolution).
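As a quick check on these numbers, here is a minimal Python sketch that counts one multiplication per filter weight per output position (ignoring additions and biases, which the slide's figures also ignore):

```python
# Count multiplications for a convolution layer:
# every output position (h_out x w_out) applies each of the c_out filters
# over a k x k x c_in window of the input.
def conv_mults(h_out, w_out, c_in, k, c_out):
    return h_out * w_out * c_in * k * k * c_out

# Direct 5x5 convolution on a 28x28x192 tensor with 32 filters.
direct = conv_mults(28, 28, 192, 5, 32)       # ~120.4 million

# 1x1 bottleneck down to 16 channels, then the 5x5 convolution to 32 channels.
bottleneck = conv_mults(28, 28, 192, 1, 16)   # ~2.4 million
conv5x5 = conv_mults(28, 28, 16, 5, 32)       # ~10.0 million

print(f"direct 5x5:       {direct:,}")
print(f"1x1 + 5x5 total:  {bottleneck + conv5x5:,}")
print(f"reduction factor: {direct / (bottleneck + conv5x5):.1f}x")
```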
8
GoogLeNet - Architectural Details
• This is the parameter and operation calculation for the Inception 3a module inside the actual GoogLeNet
(a code sketch of the module follows below).
Inception in GoogLeNet (Inception 3a)
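For reference, a minimal PyTorch-style sketch of the Inception module, filled in with the channel counts the paper lists for Inception 3a (1×1: 64, 3×3 reduce: 96, 3×3: 128, 5×5 reduce: 16, 5×5: 32, pool projection: 32, giving 256 output channels on a 28×28×192 input). This is an illustrative re-implementation, not the authors' original code:

```python
import torch
import torch.nn as nn

class Inception(nn.Module):
    """Four parallel branches whose outputs are concatenated along the channel axis."""
    def __init__(self, c_in, c1, c3_reduce, c3, c5_reduce, c5, pool_proj):
        super().__init__()
        # Branch 1: 1x1 convolution.
        self.b1 = nn.Sequential(nn.Conv2d(c_in, c1, 1), nn.ReLU(inplace=True))
        # Branch 2: 1x1 reduction, then 3x3 convolution.
        self.b2 = nn.Sequential(
            nn.Conv2d(c_in, c3_reduce, 1), nn.ReLU(inplace=True),
            nn.Conv2d(c3_reduce, c3, 3, padding=1), nn.ReLU(inplace=True))
        # Branch 3: 1x1 reduction, then 5x5 convolution.
        self.b3 = nn.Sequential(
            nn.Conv2d(c_in, c5_reduce, 1), nn.ReLU(inplace=True),
            nn.Conv2d(c5_reduce, c5, 5, padding=2), nn.ReLU(inplace=True))
        # Branch 4: 3x3 max-pooling, then 1x1 projection.
        self.b4 = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(c_in, pool_proj, 1), nn.ReLU(inplace=True))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

# Inception 3a: 28x28x192 in -> 28x28x(64+128+32+32) = 28x28x256 out.
inception_3a = Inception(192, 64, 96, 128, 16, 32, 32)
out = inception_3a(torch.randn(1, 192, 28, 28))
print(out.shape)  # torch.Size([1, 256, 28, 28])
```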
9
GoogLeNet - Architectural Details
Entire GoogLeNet
10
GoogLeNet - Architectural Details
• These are the lower layers, located close to the input image.
• For efficient memory usage, a plain CNN-style stem (convolution and max-pooling layers) is used in this lower
part.
• Inception modules are reserved for the higher layers, so they are not used here; a rough sketch of the stem
follows below.
Part 1
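A minimal PyTorch-style sketch of this stem, assuming the layer settings from the paper's architecture table (7×7/2 convolution with 64 filters, 3×3/2 max-pooling, local response normalization, a 1×1 convolution to 64 channels, a 3×3 convolution to 192 channels, and another 3×3/2 max-pooling). This is an illustrative reconstruction, not the original code:

```python
import torch
import torch.nn as nn

# Stem of GoogLeNet ("Part 1"): plain convolution/pooling layers before the
# first Inception module, taking a 224x224x3 image down to 28x28x192.
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(inplace=True),
    nn.MaxPool2d(3, stride=2, padding=1),
    nn.LocalResponseNorm(5),
    nn.Conv2d(64, 64, kernel_size=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 192, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    nn.LocalResponseNorm(5),
    nn.MaxPool2d(3, stride=2, padding=1),
)

x = torch.randn(1, 3, 224, 224)
print(stem(x).shape)  # torch.Size([1, 192, 28, 28])
```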
11
GoogLeNet - Architectural Details
• To extract features at various scales, the Inception modules described earlier are stacked in this part.
Part 2
12
GoogLeNet - Architectural Details
• As the model becomes very deep, the vanishing gradient problem can occur even when using the ReLU
activation function.
• Auxiliary classifiers are therefore attached to intermediate layers; they produce intermediate predictions so
that an additional gradient signal can be back-propagated into the middle of the network.
• To prevent them from having too much influence, the loss of each auxiliary classifier is multiplied by 0.3 and
added to the total loss of the network (a sketch of this weighting follows below).
• At test time, the auxiliary classifiers are removed and only the softmax output at the far end is used.
Part 3
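A minimal sketch of how the weighted auxiliary losses could be combined during training, assuming a model that returns the main logits plus two auxiliary logits (as GoogLeNet does); the function and variable names here are illustrative:

```python
import torch.nn.functional as F

def googlenet_loss(main_logits, aux1_logits, aux2_logits, targets, aux_weight=0.3):
    """Total training loss: main cross-entropy plus down-weighted auxiliary losses.

    The auxiliary branches exist only to inject extra gradient into the middle
    of the network; at inference time they are discarded and only main_logits
    is used.
    """
    main_loss = F.cross_entropy(main_logits, targets)
    aux_loss = F.cross_entropy(aux1_logits, targets) + F.cross_entropy(aux2_logits, targets)
    return main_loss + aux_weight * aux_loss
```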
13
GoogLeNet - Architectural Details
• This is the end of the model, where the prediction is produced.
• Instead of a large fully-connected layer, global average pooling is applied to the final feature maps.
• This reduces each feature map to a single value without any additional parameters; a single linear layer and
softmax then produce the class scores (see the sketch below).
Part 4
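A minimal sketch of this classifier head, following the paper's table (global average pooling over the final 7×7×1024 feature maps, 40% dropout, and one linear layer to 1000 classes); illustrative, not the original code:

```python
import torch
import torch.nn as nn

# Classifier head ("Part 4"): global average pooling collapses each of the
# 1024 final 7x7 feature maps to one value, adding no parameters; only the
# final linear layer carries weights.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),   # 7x7x1024 -> 1x1x1024, parameter-free
    nn.Flatten(),
    nn.Dropout(p=0.4),
    nn.Linear(1024, 1000),     # the only learned parameters in the head
)

features = torch.randn(1, 1024, 7, 7)
print(head(features).shape)   # torch.Size([1, 1000])
```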
14
GoogLeNet
• GoogLeNet introduced a methodology different from existing CNN designs that simply stack more layers,
using Inception modules to spend the computational budget efficiently.
• It won first place in the ILSVRC 2014 classification task, finishing ahead of VGGNet.
Conclusions
