A Detailed Explanation of the YOLOv8 Architecture
Ahmed R. A. Shamsan
YOLOv8 architecture Details | Blocks
Conv Block: the most commonly used block
• 2D Conv layer
• Batch normalization 2D
• SiLU Activation function
C2F Block
• Conv Blocks
• Bottleneck Blocks
SPPF Block
• Conv Block
• Three 2D max-pooling layers
• Conv Block
Detect Block
• Where the detection happens
Before explaining the whole YOLOv8 architecture, I will explain the details of the YOLOv8 blocks so that it is easier to follow. The most
commonly used block is the convolutional block. In YOLOv8, a
convolutional block consists of a 2D convolutional layer, a 2D
batch normalization layer, and a SiLU activation function. The three
are used together as a single convolutional block.
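To make the last two stages of the convolutional block concrete, here is a minimal sketch in plain Python. It is not the YOLOv8 code: the helper names `batch_norm`, `silu`, and `conv_block_tail` are made up for illustration, the batch normalization has no learned scale or shift, and the input list simply stands in for the raw outputs of the 2D convolution.

```python
import math


def batch_norm(values, eps=1e-5):
    """Normalize values to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def conv_block_tail(conv_outputs):
    """Apply batch normalization, then SiLU, to raw convolution outputs."""
    return [silu(v) for v in batch_norm(conv_outputs)]


# Hypothetical raw convolution outputs, for illustration only.
activations = conv_block_tail([1.0, 2.0, 3.0, 4.0])
print(activations)
```

SiLU lets small negative values through instead of clipping them to zero, which is one way it differs from ReLU.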
The second block is the C2F block
• It begins with a convolutional block whose output
feature maps are split in two: one part goes to the bottleneck blocks,
• whereas the other goes directly into the concat block.
• A C2F block can contain many bottleneck blocks. At the
end there is another convolutional block.
• The bottleneck itself is a sequence of convolutional blocks with a
shortcut.
• If you are familiar with ResNet, the bottleneck block is
quite similar to the ResNet block. The difference is that there
is also a bottleneck variant without a shortcut.
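The split, bottleneck, and concat steps above can be sketched as simple channel bookkeeping. This is only a sketch under assumptions that the slides do not state: the hidden width is half of the output channels (`expansion=0.5`), and both halves of the split plus the output of every bottleneck are concatenated, which is how I understand the Ultralytics implementation. The function names are hypothetical.

```python
def bottleneck(x, transform, shortcut=True):
    """Apply `transform` (a stand-in for the convolutional blocks), optionally adding the input back."""
    y = transform(x)
    return x + y if shortcut else y


def c2f_channel_flow(out_channels, n, expansion=0.5):
    """Track channel counts through a C2F block with n bottleneck blocks (assumed layout)."""
    hidden = int(out_channels * expansion)
    parts = [hidden, hidden]          # the two halves produced by the split
    for _ in range(n):
        parts.append(hidden)          # each bottleneck output is kept for the concat
    return {
        "first_conv": 2 * hidden,     # channels before the split
        "split": (hidden, hidden),
        "concat": sum(parts),         # channels entering the last convolutional block
        "output": out_channels,
    }


flow = c2f_channel_flow(out_channels=128, n=3)
print(flow)

# A bottleneck with and without the shortcut, on a single number for clarity.
with_shortcut = bottleneck(2.0, lambda v: 3 * v)
without_shortcut = bottleneck(2.0, lambda v: 3 * v, shortcut=False)
print(with_shortcut, without_shortcut)
```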
The third block is the SPPF block
SPPF stands for Spatial Pyramid Pooling Fast.
It is a modification of Spatial Pyramid Pooling (SPP)
that runs faster.
Inside the SPPF there are
• a convolutional block at the beginning,
• followed by three 2D max-pooling layers.
• The interesting part is that every resulting feature map is
concatenated right before the end of the SPPF, which ends with a
convolutional block.
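One way to see why chaining the pooling layers is faster yet equivalent is the sketch below. It assumes a 5x5 pooling window with stride 1 and "same" padding (the kernel size is not given on the slide; 5 is, to my knowledge, the Ultralytics default), and the helper names are hypothetical. Three 5x5 poolings applied one after another cover the same area as single 5x5, 9x9, and 13x13 poolings.

```python
def max_pool_same(grid, k):
    """k x k max pooling with stride 1; windows are clipped at the border so the size is unchanged."""
    h, w = len(grid), len(grid[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                grid[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def sppf_pools(grid, k=5):
    """Return the four maps that SPPF concatenates: the input and three chained poolings."""
    p1 = max_pool_same(grid, k)
    p2 = max_pool_same(p1, k)
    p3 = max_pool_same(p2, k)
    return [grid, p1, p2, p3]


# A small deterministic test pattern, for illustration only.
grid = [[(7 * i + 3 * j) % 11 for j in range(8)] for i in range(8)]
maps = sppf_pools(grid)

print(maps[2] == max_pool_same(grid, 9))   # second chained pooling vs. one 9x9 pooling
print(maps[3] == max_pool_same(grid, 13))  # third chained pooling vs. one 13x13 pooling
```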
The last block is the Detect block
• This is where the detection happens.
• Unlike previous YOLO versions, YOLOv8 is an anchor-free model.
• The predictions happen in the grid cells.
• The Detect block contains two tracks:
1. the first track is for bounding box prediction,
2. whereas the other is for class prediction.
• Both tracks have the same block sequence: two convolutional blocks followed by a single
2D convolutional layer.
THE FUNDAMENTAL COMPONENTS OF A CONVOLUTIONAL NEURAL NETWORK | [1] KERNEL
• The kernel is a two-dimensional array. Kernels are often
called feature detectors.
• The values in the kernel are weights that are updated
during the training process.
• The kernel moves across the image and computes a dot
product between the input and the kernel values to
produce an output.
• The output is also known as a feature map.
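The sliding dot product described above can be sketched in plain Python as follows. This is an illustrative toy with stride one and no padding, not how a deep learning framework implements convolution. The image and kernel values are made up, and in a real network the kernel weights would be learned.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the image patch under it.
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[2, 4, 6], [0, 2, 4], [6, 0, 2]]
```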
THE FUNDAMENTAL COMPONENTS OF A CONVOLUTIONAL NEURAL NETWORK | [2] STRIDE
• The stride is the distance the kernel is displaced
at each step of the convolution.
• The larger the stride, the smaller the resulting output.
Convolution with stride one is demonstrated in this
example.
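The relation between stride and output size can be written as a short formula. The sketch below uses the standard output-size expression for a convolution, floor((n + 2p - k) / s) + 1. The function names are hypothetical, and the 5-pixel input is just an illustrative value.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one axis: floor((n + 2*padding - kernel) / stride) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def window_starts(n, kernel, stride=1, padding=0):
    """Positions (in the padded input) where the kernel is placed."""
    return [i * stride for i in range(conv_output_size(n, kernel, stride, padding))]


# A 5-pixel row with a 3-wide kernel: a larger stride gives a smaller output.
print(conv_output_size(5, 3, stride=1), window_starts(5, 3, stride=1))  # 3 [0, 1, 2]
print(conv_output_size(5, 3, stride=2), window_starts(5, 3, stride=2))  # 2 [0, 2]

# Kernel 3, stride 2, padding 1 halves an even resolution such as 640.
print(conv_output_size(640, 3, stride=2, padding=1))  # 320
```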
THE FUNDAMENTAL COMPONENTS OF A CONVOLUTIONAL NEURAL NETWORK | [3] PADDING
• Padding adds values around the outermost elements of the image. There are
several types of padding:
1. Zeros padding is the default padding type. In zeros padding, the padded pixels
have a value of zero.
2. In replication padding, the padded pixels have the same value as the closest real
pixel, so the padded corners have the same value as the real corners.
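The two padding types above can be compared side by side with a small sketch. The function `pad2d` and its `mode` names are chosen for this illustration only, and the 2x2 image is a made-up example.

```python
def pad2d(image, pad, mode="zeros"):
    """Pad a 2D list by `pad` pixels on every side using zeros or replication."""
    if mode not in ("zeros", "replicate"):
        raise ValueError(f"unknown padding mode: {mode}")
    h, w = len(image), len(image[0])
    padded = []
    for i in range(-pad, h + pad):
        row = []
        for j in range(-pad, w + pad):
            if 0 <= i < h and 0 <= j < w:
                row.append(image[i][j])          # real pixel
            elif mode == "zeros":
                row.append(0)                    # padded pixel is zero
            else:
                ci = min(max(i, 0), h - 1)       # clamp to the closest real row
                cj = min(max(j, 0), w - 1)       # clamp to the closest real column
                row.append(image[ci][cj])
        padded.append(row)
    return padded


image = [[1, 2],
         [3, 4]]

zeros = pad2d(image, 1, "zeros")
replicate = pad2d(image, 1, "replicate")

for row in zeros:
    print(row)
for row in replicate:
    print(row)
```

With replication, each padded corner copies the nearest real corner (1, 2, 3, and 4 here), as described in point 2.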
The YOLOv8 architecture
Backbone
is the deep learning architecture that basically acts as a feature extractor.
Neck
combines the features acquired from the various layers of the backbone model.
Head
predicts the classes and bounding box regions, which is the final output produced by the object detection model.
The YOLOv8 architecture
However, the neck is not explicitly defined in YOLOv8. The term neck appears only in the official YOLOv8
documentation.
In the YOLOv8 architecture file, yolov8.yaml, there are only two
parts: the backbone and the head.
Next, I will explain the whole YOLOv8
architecture. This architecture drawing is based
on the YOLOv8 architecture file, yolov8.yaml,
which is located in the models/v8 folder. It is
also heavily inspired by the drawing from
RangeKing, a GitHub user who posted an issue
in the YOLO GitHub repository. We made some
modifications to the drawing to make it more
readable and to align it with the YOLOv8 source code
itself.
• The explanation of the architecture begins with the three parameters that
define a YOLOv8 variant: depth multiple, width multiple, and max channels.
• The depth multiple determines how many bottleneck
blocks are in each C2F block.
• The width multiple and max channels determine
the output channels.
• The input to YOLO is an image with three channels.
• Next is the backbone. The name of the backbone in YOLOv8 is not
stated directly. A backbone is made up
of numerous convolutional layers that extract distinct features
at various resolution levels.
Before continuing with the layers of the backbone, I will explain the numbering used in the YOLOv8 architecture drawing. The
numbering is based on the architecture file, yolov8.yaml. It starts in the backbone section and starts from zero. For example,
this convolutional block is the first block in the architecture, so we assign it the number zero and draw the block as shown below. This numbering
continues until the last C2F block.
This backbone begins with
• two convolutional blocks with
• kernel size three, stride two,
and padding one.
• The spatial resolution of the output is
reduced when stride two is used.
• For example, if the input resolution of
the first convolutional block is
640x640, the output resolution after
processing will be 320x320.
• To obtain the output channels, use the following
formula, which is taken from the code in
tasks.py:
output channels = min(base output channels, mc) × W
• First we find the minimum of the
base output channels and max channels. The
minimum is then multiplied by the width
multiple.
• For example, we will calculate the output channels
of the first convolutional block using
a YOLOv8 variant with a width multiple of
one and max channels of 512.
• The base output channel count of the first
convolutional block is 64, so the
calculation is: first we find the minimum
of 64 and 512, which is 64, then multiply by 1. The
result is 64, so the first convolutional block
has 64 output channels.
• You can analyze the
second convolutional block in the same way as
the first one.
W = width multiple
mc = max channels
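The channel calculation above can be sketched as a small function. The rounding up to a multiple of 8 is an assumption on my part, based on how I understand the Ultralytics code, and is not stated in the slides. It changes nothing in the worked example. The names `scaled_channels` and `make_divisible` are used here for illustration, and the width multiple of 0.25 with max channels of 1024 in the last lines is only an illustrative setting for a smaller variant.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of `divisor` (assumed rounding step)."""
    return math.ceil(x / divisor) * divisor


def scaled_channels(base_channels, width_multiple, max_channels, divisor=8):
    """Output channels = min(base, max_channels) * width_multiple, rounded up to a multiple of `divisor`."""
    return make_divisible(min(base_channels, max_channels) * width_multiple, divisor)


# Worked example from the slides: width multiple 1, max channels 512.
print(scaled_channels(64, 1.0, 512))     # first convolutional block
print(scaled_channels(1024, 1.0, 512))   # a base of 1024 is capped at max channels

# Illustrative smaller setting.
print(scaled_channels(64, 0.25, 1024))
print(scaled_channels(1024, 0.25, 1024))
```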
Next is the C2F block
It has two parameters, shortcut and n.
• The shortcut parameter in this block is True, indicating that
the shortcut is used in the bottleneck blocks,
• whereas n determines how many bottleneck blocks are used.
• The value of n is calculated by multiplying the depth multiple
by 3.
• Next is another convolutional block with kernel = 3,
stride = 2, and padding = 1.
• The C2F block comes next, with shortcut = True
and n = 6 multiplied by the depth multiple.
• The output of this block is also connected to a concat
block.
• Next is another convolutional block with kernel = 3, stride
= 2, and padding = 1,
• and then another C2F block with shortcut = True and
n = 6 multiplied by the depth multiple. This block's output is
also connected to a concat block.
• Next there is another convolutional block with kernel
= 3, stride = 2, and padding = 1.
• After that there is a C2F block with shortcut = True and
n = 3 multiplied by the depth multiple. This block is
connected to the SPPF.
Spatial Pyramid Pooling Fast (SPPF) is used after
the last convolutional layer of the backbone. The main
function of the SPPF is to generate a fixed feature
representation of objects of various sizes in an image
without resizing the image or introducing spatial
information loss.
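The whole backbone can be traced layer by layer with a short sketch. Several things here are assumptions rather than facts from the slides: the base channel values other than 64 are taken from my reading of the architecture file, the `max(round(...), 1)` rule for scaling n mirrors what I believe the Ultralytics code does, and the extra rounding of channels to a multiple of 8 is left out for brevity. The function names are hypothetical.

```python
# (index, block, base_channels, base_repeats, stride) for backbone layers 0 to 9.
# Base channels other than the first 64 are assumptions, not values from the slides.
BACKBONE = [
    (0, "Conv", 64, 1, 2),
    (1, "Conv", 128, 1, 2),
    (2, "C2f", 128, 3, 1),
    (3, "Conv", 256, 1, 2),
    (4, "C2f", 256, 6, 1),
    (5, "Conv", 512, 1, 2),
    (6, "C2f", 512, 6, 1),
    (7, "Conv", 1024, 1, 2),
    (8, "C2f", 1024, 3, 1),
    (9, "SPPF", 1024, 1, 1),
]


def scale_repeats(repeats, depth_multiple):
    """Number of bottleneck blocks: base repeats times depth multiple, at least 1 (assumed rule)."""
    return max(round(repeats * depth_multiple), 1) if repeats > 1 else repeats


def trace_backbone(input_size, depth_multiple, width_multiple, max_channels):
    """Return (index, block, n, resolution, channels) for every backbone layer."""
    size = input_size
    rows = []
    for index, block, base, repeats, stride in BACKBONE:
        size //= stride  # kernel 3, stride 2, padding 1 halves an even resolution
        channels = int(min(base, max_channels) * width_multiple)
        rows.append((index, block, scale_repeats(repeats, depth_multiple), size, channels))
    return rows


rows = trace_backbone(640, depth_multiple=1.0, width_multiple=1.0, max_channels=512)
for index, block, n, size, channels in rows:
    print(f"{index:2d} {block:5s} n={n} {size}x{size}x{channels}")
```

With a depth multiple of 1, the four C2F blocks get n = 3, 6, 6, and 3, matching the values given above.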
The neck. First there is
• the upsampling layer. This layer increases the
feature map resolution of the SPPF output to match the
feature map resolution of this C2F block.
• The upsampled feature map is combined with the
features from this C2F block using concat. When
using concat, the numbers of channels are summed,
whereas the resolution is unchanged.
• For example, we will compute the concatenation of this
C2F block's feature map and the upsampled feature map
for a YOLOv8 variant in which the output of this C2F
block is 40x40x512 and the upsampled output is
40x40x512. The result of the concatenation is
40x40x1,024.
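The upsample-then-concat step can be checked with a shape-only sketch. Shapes are written as (height, width, channels). The 20x20x512 SPPF output is an assumption inferred from the 40x40x512 upsampled output and a scale factor of 2, and the helper names are hypothetical.

```python
def upsample(shape, scale=2):
    """Upsampling multiplies height and width by `scale`; channels are unchanged."""
    h, w, c = shape
    return (h * scale, w * scale, c)


def concat(*shapes):
    """Concatenate along channels: resolutions must match, channel counts are summed."""
    if len({(h, w) for h, w, _ in shapes}) != 1:
        raise ValueError("all feature maps must have the same resolution")
    h, w, _ = shapes[0]
    return (h, w, sum(c for _, _, c in shapes))


sppf_out = (20, 20, 512)   # assumed SPPF output for this example
c2f_out = (40, 40, 512)    # C2F block output from the example above

upsampled = upsample(sppf_out)
merged = concat(upsampled, c2f_out)
print(upsampled, merged)
```

Without the upsampling step, the two maps would have different resolutions and `concat` would raise an error, which is why the upsampling layer comes first.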
The following block is C2F block (12). On the neck, the C2F block does not employ a shortcut, and n = 3 multiplied by the
depth multiple.
The feature map of this C2F block is upsampled (13) to match the resolution of the feature map of C2F
block (4). Using concat (14), the upsampled feature map from C2F block (12) is combined with the features from C2F block (4).
Next, there is another C2F block (15). This block reduces the channel size of the feature map. The feature map of this block
is used as an input to the Detect block. This Detect block is specialized for detecting small objects. The output of this
block is also used as an input to this convolutional block (16, P3). The convolutional block uses kernel = 3, stride = 2, and
padding = 1, so it halves the resolution of the feature map. Furthermore, concat (17) is used
to combine the feature map from convolutional block (16, P3) with the feature map from C2F block (12).
Next, there is another C2F block (18). This block
reduces the channel size of the feature map.
The feature map of this block is used as input to the
Detect block. This Detect block is specialized for
detecting medium-sized objects.
The output of this block is also used as input to this
convolutional block (19). The convolutional block uses
kernel = 3, stride = 2, and padding = 1.
Next, concat (20) is used to combine the feature map
from convolutional block (19) with the feature map from
the SPPF block (9).
Finally, there is another C2F block (21). This block's feature
map is used as an input to the Detect block.
This Detect block is specialized for detecting large
objects. That concludes the explanation of the YOLOv8
architecture.
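The wiring of the neck and head can be double-checked with a small sketch that tracks the downsampling factor (stride) of every numbered layer. The connection list reflects my reading of the architecture file, with layer indices 10 and 11 assigned to the first upsample and concat. The backbone strides follow from the five stride-2 convolutional blocks, and the 640-pixel input is the example resolution used earlier. The names are hypothetical.

```python
# Downsampling factor of each backbone layer relative to the input image.
BACKBONE_STRIDES = {0: 2, 1: 4, 2: 4, 3: 8, 4: 8, 5: 16, 6: 16, 7: 32, 8: 32, 9: 32}

# (index, block, source layers) for the neck/head, as I read the architecture file.
HEAD = [
    (10, "Upsample", [9]),
    (11, "Concat", [10, 6]),
    (12, "C2f", [11]),
    (13, "Upsample", [12]),
    (14, "Concat", [13, 4]),
    (15, "C2f", [14]),
    (16, "Conv", [15]),
    (17, "Concat", [16, 12]),
    (18, "C2f", [17]),
    (19, "Conv", [18]),
    (20, "Concat", [19, 9]),
    (21, "C2f", [20]),
]
DETECT_INPUTS = [15, 18, 21]  # small, medium, and large objects


def trace_head_strides():
    """Propagate strides through the head and check that every concat joins equal resolutions."""
    strides = dict(BACKBONE_STRIDES)
    for index, block, sources in HEAD:
        incoming = [strides[s] for s in sources]
        if block == "Upsample":
            strides[index] = incoming[0] // 2   # resolution doubles
        elif block == "Conv":
            strides[index] = incoming[0] * 2    # kernel 3, stride 2 halves the resolution
        else:
            if len(set(incoming)) != 1:
                raise ValueError(f"layer {index}: inputs have different resolutions")
            strides[index] = incoming[0]        # Concat and C2f keep the resolution
    return strides


strides = trace_head_strides()
grid_sizes = [640 // strides[i] for i in DETECT_INPUTS]
print(grid_sizes)                        # grid size per detection scale
print(sum(g * g for g in grid_sizes))    # total number of grid cells
```

The finest grid goes to the Detect input for small objects and the coarsest to the one for large objects, which matches the roles of C2F blocks (15), (18), and (21) described above.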

More Related Content

Similar to شرح تفصيلي لهندسة YOLOv8 - انهيار كامل.pptx

Wits presentation 6_28072015
Wits presentation 6_28072015Wits presentation 6_28072015
Wits presentation 6_28072015Beatrice van Eden
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level FeatureDongmin Choi
 
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Implementation of quantum gates using verilog
Implementation of quantum gates using verilogImplementation of quantum gates using verilog
Implementation of quantum gates using verilogShashank Kumar
 
Out-of-Core Construction of Sparse Voxel Octrees
Out-of-Core Construction of Sparse Voxel OctreesOut-of-Core Construction of Sparse Voxel Octrees
Out-of-Core Construction of Sparse Voxel OctreesJeroen Baert
 
U-Netpresentation.pptx
U-Netpresentation.pptxU-Netpresentation.pptx
U-Netpresentation.pptxNoorUlHaq47
 
A brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsA brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsShunta Saito
 
build a Convolutional Neural Network (CNN) using TensorFlow in Python
build a Convolutional Neural Network (CNN) using TensorFlow in Pythonbuild a Convolutional Neural Network (CNN) using TensorFlow in Python
build a Convolutional Neural Network (CNN) using TensorFlow in PythonKv Sagar
 
Sayyad slides ase13_v4
Sayyad slides ase13_v4Sayyad slides ase13_v4
Sayyad slides ase13_v4CS, NcState
 
A Simple Communication System Design Lab #1 with MATLAB Simulink
A Simple Communication System Design Lab #1 with MATLAB Simulink A Simple Communication System Design Lab #1 with MATLAB Simulink
A Simple Communication System Design Lab #1 with MATLAB Simulink Jaewook. Kang
 
NAND and NOR implementation and Other two level implementation
NAND and NOR implementation and  Other two level implementationNAND and NOR implementation and  Other two level implementation
NAND and NOR implementation and Other two level implementationMuhammad Akhtar
 
Trivium Cipher Design
Trivium Cipher DesignTrivium Cipher Design
Trivium Cipher DesignMohid Nabil
 
Modern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationModern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationGioele Ciaparrone
 
PR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
PR-108: MobileNetV2: Inverted Residuals and Linear BottlenecksPR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
PR-108: MobileNetV2: Inverted Residuals and Linear BottlenecksJinwon Lee
 

Similar to شرح تفصيلي لهندسة YOLOv8 - انهيار كامل.pptx (20)

Wits presentation 6_28072015
Wits presentation 6_28072015Wits presentation 6_28072015
Wits presentation 6_28072015
 
YOLO
YOLOYOLO
YOLO
 
Review: You Only Look One-level Feature
Review: You Only Look One-level FeatureReview: You Only Look One-level Feature
Review: You Only Look One-level Feature
 
B.tech_project_ppt.pptx
B.tech_project_ppt.pptxB.tech_project_ppt.pptx
B.tech_project_ppt.pptx
 
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
Convolutional Neural Networks - Veronica Vilaplana - UPC Barcelona 2018
 
Implementation of quantum gates using verilog
Implementation of quantum gates using verilogImplementation of quantum gates using verilog
Implementation of quantum gates using verilog
 
Out-of-Core Construction of Sparse Voxel Octrees
Out-of-Core Construction of Sparse Voxel OctreesOut-of-Core Construction of Sparse Voxel Octrees
Out-of-Core Construction of Sparse Voxel Octrees
 
U-Netpresentation.pptx
U-Netpresentation.pptxU-Netpresentation.pptx
U-Netpresentation.pptx
 
3-Block Ciphers and DES.pdf
3-Block Ciphers and DES.pdf3-Block Ciphers and DES.pdf
3-Block Ciphers and DES.pdf
 
A brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsA brief introduction to recent segmentation methods
A brief introduction to recent segmentation methods
 
DMS MODULE 1 PRESENTATION.pptx
DMS MODULE 1 PRESENTATION.pptxDMS MODULE 1 PRESENTATION.pptx
DMS MODULE 1 PRESENTATION.pptx
 
build a Convolutional Neural Network (CNN) using TensorFlow in Python
build a Convolutional Neural Network (CNN) using TensorFlow in Pythonbuild a Convolutional Neural Network (CNN) using TensorFlow in Python
build a Convolutional Neural Network (CNN) using TensorFlow in Python
 
Sayyad slides ase13_v4
Sayyad slides ase13_v4Sayyad slides ase13_v4
Sayyad slides ase13_v4
 
A Simple Communication System Design Lab #1 with MATLAB Simulink
A Simple Communication System Design Lab #1 with MATLAB Simulink A Simple Communication System Design Lab #1 with MATLAB Simulink
A Simple Communication System Design Lab #1 with MATLAB Simulink
 
NAND and NOR implementation and Other two level implementation
NAND and NOR implementation and  Other two level implementationNAND and NOR implementation and  Other two level implementation
NAND and NOR implementation and Other two level implementation
 
Yolo
YoloYolo
Yolo
 
3A.ppt
3A.ppt3A.ppt
3A.ppt
 
Trivium Cipher Design
Trivium Cipher DesignTrivium Cipher Design
Trivium Cipher Design
 
Modern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentationModern Convolutional Neural Network techniques for image segmentation
Modern Convolutional Neural Network techniques for image segmentation
 
PR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
PR-108: MobileNetV2: Inverted Residuals and Linear BottlenecksPR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
PR-108: MobileNetV2: Inverted Residuals and Linear Bottlenecks
 

More from ِِِAhmed R. A. Shamsan

image processing EdgeDetection Luc03 part 01.pdf
image processing EdgeDetection Luc03 part 01.pdfimage processing EdgeDetection Luc03 part 01.pdf
image processing EdgeDetection Luc03 part 01.pdfِِِAhmed R. A. Shamsan
 
digital image enhancement techniques and applcations.pdf
digital image enhancement techniques and applcations.pdfdigital image enhancement techniques and applcations.pdf
digital image enhancement techniques and applcations.pdfِِِAhmed R. A. Shamsan
 
Image Edge Detection Operators in Digital Image Processing _ L1.pdf
Image Edge Detection Operators in Digital Image Processing _ L1.pdfImage Edge Detection Operators in Digital Image Processing _ L1.pdf
Image Edge Detection Operators in Digital Image Processing _ L1.pdfِِِAhmed R. A. Shamsan
 
Ms powerpoint بالعربي شرح ميكروسوفت باوربويت العرض التقديمي
Ms powerpoint بالعربي شرح ميكروسوفت باوربويت العرض التقديميMs powerpoint بالعربي شرح ميكروسوفت باوربويت العرض التقديمي
Ms powerpoint بالعربي شرح ميكروسوفت باوربويت العرض التقديميِِِAhmed R. A. Shamsan
 

More from ِِِAhmed R. A. Shamsan (20)

image processing EdgeDetection Luc03 part 01.pdf
image processing EdgeDetection Luc03 part 01.pdfimage processing EdgeDetection Luc03 part 01.pdf
image processing EdgeDetection Luc03 part 01.pdf
 
image processing_ Edge Detection Luc02.pdf
image processing_ Edge Detection Luc02.pdfimage processing_ Edge Detection Luc02.pdf
image processing_ Edge Detection Luc02.pdf
 
image processing _Edge Detection Luc01.pdf
image processing _Edge Detection Luc01.pdfimage processing _Edge Detection Luc01.pdf
image processing _Edge Detection Luc01.pdf
 
digital image enhancement techniques and applcations.pdf
digital image enhancement techniques and applcations.pdfdigital image enhancement techniques and applcations.pdf
digital image enhancement techniques and applcations.pdf
 
Image Edge Detection Operators in Digital Image Processing _ L1.pdf
Image Edge Detection Operators in Digital Image Processing _ L1.pdfImage Edge Detection Operators in Digital Image Processing _ L1.pdf
Image Edge Detection Operators in Digital Image Processing _ L1.pdf
 
Intorduction to databases 2021
Intorduction to databases 2021Intorduction to databases 2021
Intorduction to databases 2021
 
5 sql language
5   sql language5   sql language
5 sql language
 
4 sql language
4   sql language4   sql language
4 sql language
 
Computer skills 2019 last edition a
Computer skills 2019 last edition aComputer skills 2019 last edition a
Computer skills 2019 last edition a
 
Ms powerpoint بالعربي شرح ميكروسوفت باوربويت العرض التقديمي
Ms powerpoint بالعربي شرح ميكروسوفت باوربويت العرض التقديميMs powerpoint بالعربي شرح ميكروسوفت باوربويت العرض التقديمي
Ms powerpoint بالعربي شرح ميكروسوفت باوربويت العرض التقديمي
 
Ms excel
Ms excel  Ms excel
Ms excel
 
Ms word
Ms word Ms word
Ms word
 
Ms windows 7
Ms windows 7Ms windows 7
Ms windows 7
 
Internet basices
Internet basices Internet basices
Internet basices
 
dos fundamentals
dos fundamentalsdos fundamentals
dos fundamentals
 
Queues and linked lists
Queues and linked listsQueues and linked lists
Queues and linked lists
 
Linked list
Linked listLinked list
Linked list
 
10 introduction to ds 2-2019 - heap
10   introduction to ds 2-2019 - heap10   introduction to ds 2-2019 - heap
10 introduction to ds 2-2019 - heap
 
introduction to ds 2-2019 - complete tree
 introduction to ds 2-2019 - complete tree introduction to ds 2-2019 - complete tree
introduction to ds 2-2019 - complete tree
 
introduction to binary search trees
introduction to binary search treesintroduction to binary search trees
introduction to binary search trees
 

Recently uploaded

Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...anjaliyadav012327
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 

Recently uploaded (20)

Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 

شرح تفصيلي لهندسة YOLOv8 - انهيار كامل.pptx

  • 2. YOLOv8 architecture Details | Blocks Conv Block: the most common used Block • 2D Conv layer • Batch normalization 2D • SiLU Activation function C2F Block • Conv Blocks • Bottleneck Blocks SPPF Block • Conv Blocks • 3 2D maxpooling Layers • Conv Block Detect Block • Where The Detection Happen I will explain YOLO architecture before explaining the whole YOLOv8 architecture I will explain the details of YOLOv8 blocks they are all used together into a single convolutional block
  • 3.
  • 4.
  • 5. video I will explain YOLO architecture before explaining the whole YOLO f8 architecture I will explain details of YOLOv8 blocks so that you can understand more easily, the most commonly used block is the convolutional Block in YOLOv8 a convolutional block consists of a 2d convolutional layer a 2d beds normalization and a SiLU activation function
  • 6. The second block is the C2F block • contains a convolutional block which then the resulting feature Maps will be split one goes to the bottleneck block • whereas the other goes directly into the concet Block • in the C2F Block we can have many bottleneck blocks at the end there is another convolutional block. • bottleneck itself is a sequence of convolutional blocks with a shortcut • if you are familiar with the ResNet block Bottleneck block is pretty similar to the ResNet block the difference is that there is a bottleneck without a shortcut
  • 7. The third block is the SPPF block. SPPF stands for Spatial Pyramid Pooling Fast; it is a modification of SPP (Spatial Pyramid Pooling) that runs faster. Inside the SPPF there is • a convolutional block at the beginning, • followed by three 2D max-pooling layers. • The interesting part is that every resulting feature map is concatenated right before the end of the SPPF, which ends with a convolutional block.
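Why chaining the pooling layers is faster can be seen in one dimension. The sketch below assumes stride-1 pooling with a window of 5 (a radius of 2), which the slide does not state, and ignores positions outside the input. Under those assumptions, applying the small pool twice or three times gives the same result as one pool with a window of 9 or 13, so the chained version reuses work instead of running three large pools.

```python
def max_pool_1d(values, radius):
    """Stride-1 max pooling; positions outside the input are ignored."""
    n = len(values)
    return [
        max(values[max(0, i - radius): min(n, i + radius + 1)])
        for i in range(n)
    ]


x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]

y1 = max_pool_1d(x, 2)   # window of 5
y2 = max_pool_1d(y1, 2)  # equivalent to a window of 9 applied to x
y3 = max_pool_1d(y2, 2)  # equivalent to a window of 13 applied to x

# SPPF concatenates the input and all three pooled maps before the last Conv.
sppf_features = [x, y1, y2, y3]
print(len(sppf_features), "feature maps are concatenated")
```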
  • 8. The last block is the Detect block. • This is where the detection happens. • Unlike previous YOLO versions, YOLOv8 is an anchor-free model: • the predictions happen in the grid cells. • The Detect block contains two tracks: 1. the first track is for bounding box prediction, 2. whereas the other is for class prediction. • Both tracks have the same block sequence: two convolutional blocks followed by a single 2D convolutional layer.
  • 9. THE FUNDAMENTAL COMPONENTS OF A CONVOLUTIONAL NEURAL NETWORK | [1] KERNEL • The kernel is a two-dimensional array; kernels are usually called feature detectors. • The values in the kernel are weights that can be updated during the training process. • The kernel moves across the image and computes a dot product between the input and the kernel values to produce an output. • The output is also known as a feature map.
  • 10. THE FUNDAMENTAL COMPONENTS OF A CONVOLUTIONAL NEURAL NETWORK | [2] STRIDE • The stride is defined as the displacement distance of the kernel during the convolution process. • The larger the stride, the smaller the resulting output. Convolution with stride one is demonstrated in this example.
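The kernel and stride descriptions above can be sketched in plain Python. This is a toy, unpadded sliding dot product written for illustration, not how a deep learning library implements convolution; the 5 × 5 image and the identity kernel are made-up inputs.

```python
def conv2d(image, kernel, stride=1):
    """Slide a square kernel over the image and take a dot product per window."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


image = [[r * 5 + c for c in range(5)] for r in range(5)]  # 5x5 toy image
kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]  # picks the centre of each window

fm_stride1 = conv2d(image, kernel, stride=1)  # 3x3 feature map
fm_stride2 = conv2d(image, kernel, stride=2)  # 2x2 feature map
print(fm_stride1)
print(fm_stride2)
```

Moving from stride 1 to stride 2 shrinks the feature map from 3 × 3 to 2 × 2 on the same input.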
  • 11. THE FUNDAMENTAL COMPONENTS OF A CONVOLUTIONAL NEURAL NETWORK | [3] PADDING • Padding adds values around the outermost elements of the image. There are several types of padding.
  • 12. THE FUNDAMENTAL COMPONENTS OF A CONVOLUTIONAL NEURAL NETWORK | [3] PADDING • 1. Zero padding is the default padding type. With zero padding, the padded pixels have a value of zero.
  • 13. THE FUNDAMENTAL COMPONENTS OF A CONVOLUTIONAL NEURAL NETWORK | [3] PADDING • 2. Replication padding: the padded pixels have the same value as the closest real pixel, so the padded corners have the same value as the real corners.
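The two padding types can be compared on a tiny image. The helper `pad2d` below is a hypothetical pure-Python illustration, not a library function.

```python
def pad2d(image, pad=1, mode="zeros"):
    """Pad a 2D list with zeros or by replicating the nearest real pixel."""
    h, w = len(image), len(image[0])
    padded = []
    for r in range(-pad, h + pad):
        row = []
        for c in range(-pad, w + pad):
            if 0 <= r < h and 0 <= c < w:
                row.append(image[r][c])          # real pixel
            elif mode == "zeros":
                row.append(0)                    # padded pixel is zero
            elif mode == "replicate":
                nearest_r = min(max(r, 0), h - 1)
                nearest_c = min(max(c, 0), w - 1)
                row.append(image[nearest_r][nearest_c])
            else:
                raise ValueError(f"unknown padding mode: {mode}")
        padded.append(row)
    return padded


image = [[1, 2], [3, 4]]
zero_padded = pad2d(image, pad=1, mode="zeros")
replicate_padded = pad2d(image, pad=1, mode="replicate")
print(zero_padded)
print(replicate_padded)
```

In the replicated result each corner of the padded image repeats the matching corner of the real image.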
  • 14. The YOLOv8 architecture. Backbone: the deep learning architecture that acts as a feature extractor. Neck: combines the features acquired from the various layers of the backbone. Head: predicts the classes and bounding box regions, which are the final output of the object detection model.
  • 16. However, the neck is not explicitly mentioned in YOLOv8; the term neck is only written in the official YOLOv8 documentation.
  • 17. In the YOLOv8 architecture file yolov8.yaml there are only two parts: the backbone and the head.
  • 18. Next, I will explain the whole YOLOv8 architecture. This architecture drawing is based on the YOLOv8 architecture file yolov8.yaml, which is located in the models/v8 folder. It is also heavily inspired by the drawing from RangeKing, a GitHub user who posted an issue in the YOLO GitHub repository. We made some modifications to the drawing to make it more readable and to align it with the YOLOv8 source code itself.
  • 19. • The explanation of the architecture begins with the three parameters that define a YOLOv8 variant: depth multiple, width multiple, and max channels. The depth multiple determines how many bottleneck blocks are in each C2F block. • The width multiple and max channels determine the output channels. • The input to YOLO is an image with three channels. • Next is the backbone. The name of the backbone in YOLO is not stated directly. The backbone is made up of numerous convolution layers that extract distinct features at various resolution levels.
  • 20. Before continuing with the layers of the backbone, I will explain the numbering in the YOLOv8 architecture. The numbering is based on the architecture file yolov8.yaml. It starts in the backbone section and starts from zero. For example, this convolutional block is the first block in the architecture, so we assign it the number zero and draw the block as shown below. This numbering continues until the last C2F block.
  • 21. The backbone begins with • two convolutional blocks with • kernel size 3, stride 2, and padding 1. • The spatial resolution of the output is reduced when stride 2 is used. • For example, if the input resolution of the first convolutional block is 640 × 640, the output resolution after processing is 320 × 320. • To obtain the output channel, use the following formula, which is taken from the code in tasks.py.
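The 640 to 320 example follows from the standard convolution output-size relation, floor((size + 2 × padding − kernel) / stride) + 1. A minimal sketch, with a hypothetical helper name:

```python
def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Spatial size after a convolution, using the usual floor formula."""
    return (size + 2 * padding - kernel) // stride + 1


after_first_conv = conv_output_size(640)                 # first Conv block
after_second_conv = conv_output_size(after_first_conv)   # second Conv block
print(after_first_conv, after_second_conv)
```

With kernel 3, stride 2, and padding 1, each convolutional block halves an even input resolution.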
  • 22. • To obtain the output channel, use the following formulas, which are taken from the code in tasks.py: (1) (2)
  • 23. • First we find the minimum of the base output channel and max channels; that minimum is then multiplied by the width multiple. • For example, we calculate the output channel of the first convolutional block using the YOLOv8 variant with a width multiple of 1 and max channels of 512. • The base output channel of the first convolutional block is 64. We take the minimum of 64 and 512, which is 64, and multiply it by 1; the result is 64, so 64 is the output channel of the first convolutional block. • You can analyze the second convolutional block in the same way as the first one.
  • 24.
  • 25. w = width multiple, mc = max channels
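The channel rule described in words above can be written as a small function. The formula images themselves did not survive in this transcript, so this is a sketch of the verbal description only. The final rounding up to a multiple of 8 is my assumption about what the Ultralytics code does; it does not change the slide's example. The function name and the second and third test inputs are hypothetical.

```python
import math


def scaled_channels(base, width_multiple, max_channels, divisor=8):
    """min(base, max_channels) * width_multiple, rounded up to a multiple of divisor."""
    scaled = min(base, max_channels) * width_multiple
    # Assumption: round up to a multiple of `divisor`, as I believe
    # the Ultralytics code does; not stated on the slide.
    return math.ceil(scaled / divisor) * divisor


first_conv = scaled_channels(64, width_multiple=1.0, max_channels=512)
capped = scaled_channels(1024, width_multiple=1.0, max_channels=512)
narrow = scaled_channels(64, width_multiple=0.25, max_channels=1024)
print(first_conv, capped, narrow)
```

The first call reproduces the slide's worked example; the second shows max channels acting as a cap, and the third shows a narrower variant.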
  • 26. Next is the C2F block, which has two parameters: shortcut and n. • The shortcut parameter in this block is True, indicating that the shortcut is used in the bottleneck blocks, • whereas n determines how many bottleneck blocks are used. • The value of n is calculated by multiplying the depth multiple by 3. • Next is another convolutional block with kernel = 3, stride = 2, and padding = 1. • The next C2F block has shortcut = True and n = 6 multiplied by the depth multiple. • The output of this block is also connected to a concat block. • Next is another convolutional block with kernel = 3, stride = 2, and padding = 1, • and then another C2F block with shortcut = True and n = 6 multiplied by the depth multiple; this block's output is also connected to a concat block. • Next there is another convolutional block with kernel = 3, stride = 2, and padding = 1. • After that there is a C2F block with shortcut = True and n = 3 multiplied by the depth multiple; this block is connected to the SPPF.
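The bottleneck counts for the four backbone C2F blocks can be sketched as below. The base values 3, 6, 6, 3 come from the slide. Rounding the product and keeping at least one bottleneck is my assumption about how the Ultralytics code handles fractional results, and the depth multiple of 0.33 is only an illustrative input.

```python
def scaled_repeats(base_n, depth_multiple):
    """Number of bottleneck blocks after applying the depth multiple."""
    # Assumption: round to the nearest integer and never go below one block.
    return max(round(base_n * depth_multiple), 1) if base_n > 1 else base_n


backbone_base_n = [3, 6, 6, 3]  # base n of the four backbone C2F blocks

full_depth = [scaled_repeats(n, 1.0) for n in backbone_base_n]
shallow = [scaled_repeats(n, 0.33) for n in backbone_base_n]
print(full_depth, shallow)
```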
  • 27. Spatial Pyramid Pooling Fast (SPPF) is used after the last convolution layer of the backbone. The main function of the SPPF is to generate a fixed feature representation of objects of various sizes in an image without resizing the image or introducing spatial information loss.
  • 28. The neck. First there is • the upsampling layer, which increases the feature map resolution of the SPPF output to match the feature map resolution of this C2F block. • The upsampled feature map is combined with the features from this C2F block using Concat. With Concat, the number of channels is summed while the resolution is unchanged. • For example, we compute the concatenation of this C2F block's feature map and the upsampled feature map. In the YOLOv8 variant used here, the output of this C2F block is 40 × 40 × 512 and the upsampled output is 40 × 40 × 512, so the result of the concatenation is 40 × 40 × 1024.
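The upsample-then-concatenate step can be checked with shape arithmetic alone. In the sketch below, shapes are (height, width, channels) tuples; the 20 × 20 × 512 SPPF output is an assumption chosen to be consistent with the 40 × 40 × 512 upsampled map in the example, and both helper names are hypothetical.

```python
def upsample_shape(shape, factor=2):
    """Upsampling scales height and width and leaves channels unchanged."""
    h, w, c = shape
    return (h * factor, w * factor, c)


def concat_shape(*shapes):
    """Channel-wise concat: resolutions must match, channel counts are summed."""
    h, w, _ = shapes[0]
    for s in shapes:
        if (s[0], s[1]) != (h, w):
            raise ValueError(f"resolution mismatch: {s} vs {(h, w)}")
    return (h, w, sum(s[2] for s in shapes))


sppf_out = (20, 20, 512)   # assumed SPPF output for a 640x640 input
c2f_out = (40, 40, 512)    # C2F block output from the example

upsampled = upsample_shape(sppf_out)
merged = concat_shape(upsampled, c2f_out)
print(upsampled, merged)
```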
  • 29.
  • 30. Next is C2F block (12). In the neck, the C2F block does not employ a shortcut, and n = 3 multiplied by the depth multiple. The feature map of this C2F block is upsampled (13) to match the resolution of the feature map of C2F block (4). Using Concat (14), the upsampled feature map is combined with the features from C2F block (4).
  • 31. Next, there is another C2F block (15). This block reduces the channel size of the feature map. Its feature map is used as an input to the Detect block that is specialized for detecting small objects. The output of this block is also used as an input to convolutional block (16, P3). That convolutional block uses kernel = 3, stride = 2, and padding = 1, so it halves the resolution of the feature map. Concat (17) then combines the feature map from convolutional block (16, P3) with the feature map from C2F block (12).
  • 32. Next, there is another C2F block (18). This block reduces the channel size of the feature map. Its feature map is used as an input to the Detect block that is specialized for detecting medium-sized objects. The output of this block is also used as an input to convolutional block (19), which uses kernel = 3, stride = 2, and padding = 1. Concat (20) then combines the feature map from convolutional block (19) with the feature map from the SPPF block (9). Finally, there is another C2F block (21); its feature map is used as an input to the Detect block that is specialized for detecting large objects.
  • 33. That concludes the explanation of the YOLOv8 architecture.
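The whole walk through the backbone and neck can be summarized by tracing the feature map resolution layer by layer. The layer list below is my reading of the numbering described in the slides (indices 0 to 21, with the final C2F block at index 21) and should be treated as an assumption, not a copy of the configuration file; the 640 × 640 input is the one used in the earlier example.

```python
# (layer type, effect on resolution), indexed from 0 as in the architecture file.
LAYERS = [
    ("Conv", "down"), ("Conv", "down"), ("C2F", "same"), ("Conv", "down"),
    ("C2F", "same"), ("Conv", "down"), ("C2F", "same"), ("Conv", "down"),
    ("C2F", "same"), ("SPPF", "same"),                       # backbone: 0-9
    ("Upsample", "up"), ("Concat", "same"), ("C2F", "same"),   # 10-12
    ("Upsample", "up"), ("Concat", "same"), ("C2F", "same"),   # 13-15
    ("Conv", "down"), ("Concat", "same"), ("C2F", "same"),     # 16-18
    ("Conv", "down"), ("Concat", "same"), ("C2F", "same"),     # 19-21
]


def trace_resolution(input_size=640):
    """Return the square feature map size after every layer."""
    sizes = []
    size = input_size
    for _, effect in LAYERS:
        if effect == "down":
            size //= 2   # stride-2 convolution halves the resolution
        elif effect == "up":
            size *= 2    # upsampling doubles it
        sizes.append(size)
    return sizes


sizes = trace_resolution(640)
# Layers feeding the Detect block: small, medium, and large objects.
detect_inputs = {index: sizes[index] for index in (15, 18, 21)}
print(detect_inputs)
```

Under these assumptions the three Detect inputs have resolutions 80, 40, and 20, and each Concat joins two maps of equal resolution (for example layer 11 at 40 with layer 6 at 40).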

Editor's Notes

  1. In this video we explain the main components of the YOLOv8 architecture so that the architecture as a whole is easier to understand. The first component is the convolutional block, which is the best known and most widely used. As shown in the figure, this block consists of three parts: a 2D Conv layer, 2D batch normalization, and a SiLU activation function.
  2. The second block is the C2F block. It contains a convolutional block whose output feature maps are split: one part goes to the bottleneck block, while the other goes directly into the concat block. A C2F block can contain many bottleneck blocks, and it ends with another convolutional block. The bottleneck itself is a sequence of convolutional blocks with a shortcut. If you are familiar with the ResNet block, the bottleneck block is very similar to it; the difference is that there is also a bottleneck without a shortcut.
  3. The third block is the SPPF block. SPPF stands for Spatial Pyramid Pooling Fast; it is a modification of SPP (Spatial Pyramid Pooling) with a higher speed. Inside the SPPF there is a convolutional block at the beginning, followed by three 2D max-pooling layers. The interesting part is that every resulting feature map is concatenated right before the end of the SPPF, which ends with a convolutional block.
  4. The last block is the Detect block. This is where the detection happens. Unlike previous YOLO versions, YOLOv8 is an anchor-free model; the predictions happen in the grid cells. The Detect block contains two tracks: the first is for bounding box prediction, whereas the other is for class prediction. Both tracks have the same block sequence: two convolutional blocks and a single 2D convolutional layer.
  5. The kernel is a two-dimensional array; kernels are usually called feature detectors. The values in the kernel are weights that can be updated during the training process. The kernel moves across the image and computes a dot product between the input and the kernel values to produce an output. The output is also known as a feature map.
  6. The stride is defined as the displacement distance during the convolution process. The larger the stride, the smaller the resulting output. Convolution with stride one is demonstrated in this example.
  7-9. Padding adds values around the outermost elements of the image. There are several types of padding. Zero padding is the default padding type; with zero padding, the padded pixels have a value of zero. With replication padding, the padded pixels have the same value as the closest real pixel, so the padded corners have the same value as the real corners.
  10. Next, I will explain the whole YOLOv8 architecture. This architecture drawing is based on the YOLOv8 architecture file yolov8.yaml, which is located in the models/v8 folder. It is also heavily inspired by the drawing from RangeKing, a GitHub user who posted an issue in the YOLO GitHub repository. We made some modifications to the drawing to make it more readable and to align it with the YOLOv8 source code itself.
  11. The explanation begins with the three parameters that define a YOLOv8 variant: depth multiple, width multiple, and max channels. The depth multiple determines how many bottleneck blocks are in each C2F block. The width multiple and max channels determine the output channels. The input to YOLO is an image with three channels. Next is the backbone; the name of the backbone in YOLO is not stated directly. The backbone is made up of numerous convolution layers that extract distinct features at various resolution levels.
  12. Before continuing with the layers of the backbone, I will explain the numbering in the YOLOv8 architecture. The numbering is based on the architecture file yolov8.yaml. It starts in the backbone section and starts from zero. For example, this convolutional block is the first block in the architecture, so we assign it the number zero and draw the block accordingly. This numbering continues until the last C2F block (as shown in the black code panel).
  13. To obtain the output channel, use formula (1) as shown in the figure; this formula is taken from the code in tasks.py. Formula (2), also shown in the figure, is taken from the same file. The formulas are explained on the next slide.
  14-16. First we find the minimum of the base output channel and max channels; that minimum is then multiplied by the width multiple. For example, we calculate the output channel of the first convolutional block using the YOLOv8 variant with a width multiple of 1 and max channels of 512. The base output channel of the first convolutional block is 64. We take the minimum of 64 and 512 and multiply it by 1; the result is 64, so 64 is the output channel of the first convolutional block. You can analyze the second convolutional block in the same way as the first one.
  17. Next is the C2F block, which has two parameters: shortcut and n. The shortcut parameter in this block is True, indicating that the shortcut is used in the bottleneck blocks, whereas n determines how many bottleneck blocks are used. The value of n is calculated by multiplying the depth multiple by 3. Next is a second convolutional block with kernel = 3, stride = 2, and padding = 1. The next C2F block has shortcut = True and n = 6 multiplied by the depth multiple; its output is also connected to a concat block. Then comes another convolutional block with kernel = 3, stride = 2, and padding = 1, followed by another C2F block with shortcut = True and n = 6 multiplied by the depth multiple; its output is also connected to a concat block. After that there is another convolutional block with kernel = 3, stride = 2, and padding = 1, and then a C2F block with shortcut = True and n = 3 multiplied by the depth multiple; this block is connected to the SPPF.
  18. Spatial Pyramid Pooling Fast (SPPF) is used after the last convolution layer of the backbone. The main function of the SPPF is to generate a fixed feature representation of objects of various sizes in an image without resizing the image or introducing spatial information loss.
  19. The neck: the upsampling layer increases the feature map resolution of the SPPF output to match the feature map resolution of this C2F block. The upsampled feature map is combined with the features from this C2F block using Concat. With Concat, the number of channels is summed while the resolution is unchanged. For example, we compute the concatenation of this C2F block's feature map and the upsampled feature map. In the YOLOv8 variant used here, the output of this C2F block is 40 × 40 × 512 and the upsampled output is 40 × 40 × 512, so the result of the concatenation is 40 × 40 × 1024.
  20. Next is C2F block (12). In the neck, the C2F block does not use a shortcut, and n = 3 multiplied by the depth multiple. The feature map of this C2F block is upsampled (13) to match the resolution of the feature map of C2F block (4). Using Concat (14), the upsampled feature map is combined with the features from C2F block (4).
  21. Next, there is another C2F block (15). This block reduces the channel size of the feature map. Its feature map is used as an input to the Detect block that is specialized for detecting small objects. The output of this block is also used as an input to convolutional block (16, P3). That convolutional block uses kernel = 3, stride = 2, and padding = 1, so it halves the resolution of the feature map. Concat (17) then combines the feature map from convolutional block (16, P3) with the feature map from C2F block (12).
  22. Next, there is another C2F block (18). This block reduces the channel size of the feature map. Its feature map is used as an input to the Detect block that is specialized for detecting medium-sized objects. The output of this block is also used as an input to convolutional block (19), which uses kernel = 3, stride = 2, and padding = 1. Concat (20) then combines the feature map from convolutional block (19) with the feature map from the SPPF block (9). Finally, there is another C2F block (21); its feature map is used as an input to the Detect block that is specialized for detecting large objects.