SlideShare a Scribd company logo
1 of 29
Download to read offline
© 2020 StradVision
Designing Bespoke
CNNs for Target Hardware
Woonhyun Nam
Director, Algorithms
StradVision
September, 2020
© 2020 StradVision
StradVision
• We develop deep learning-based perception software for Advanced Driver Assistance
Systems (ADAS) and Autonomous Vehicles
2
For more detailed information, please refer to https://youtu.be/tK7F8yvpiGA and https://stradvision.com/
© 2020 StradVision
Agenda
• Development Process of Deep Learning-based Software
• Scalability Challenge
• Layer Transformations
• Case Studies
• Conclusions
3
© 2020 StradVision
Development Process of
Deep Learning-based Software
4
© 2020 StradVision
Development Process
Repeated for each task (e.g., object detection) and target platform
5
Design Train Deploy Validate
• Reuse existing CNN designs
• Tweak some layers
• Retrain from scratch
even for small changes
• Revalidate from scratch
if output changes
(Pros)
(Cons)
Network Weights Binary Report
© 2020 StradVision
Scalability Challenge
6
© 2020 StradVision
Development Process
Scalability challenge
• Number of networks to maintain grows rapidly
• More than 100 networks become difficult to manage
7
Task Target platform
# of edges = # of networks
© 2020 StradVision
Scalable Development Process
Main idea
• Grouping target platforms based on similar computational characteristics (or
capabilities) with respect to a given CNN transformation
8
Task Target platform
Same group
# of edges = # of networks
© 2020 StradVision
Scalable Development Process
• Design/Train CNNs for each group
• Transform CNNs without retraining within same group
9
Design Train Transform Deploy Validate
Network Weights Binary Report
Network
Binary Report
Same group
Weights
© 2020 StradVision
Scalable Development Process
Cost-effective method
• Increasing computational efficiency
• E.g., lower latency, higher throughput, cheaper hardware
• Increasing engineering efficiency
• E.g., shorter development time, less engineers, more projects
• Increasing economic efficiency
• E.g., lower usage of cloud services for training, lower production-cost
10
© 2020 StradVision
Layer Transformations
11
© 2020 StradVision
Layer Transformations
• Basic operations
• Case studies
• YUV Input
• Fixed-size Kernel Acceleration in Convolution
• No FC Support
• No Custom Layer Support in Quantization
• Block-level Accumulation in Convolution
12
© 2020 StradVision
Layer Transformations
Basic operations
• (Linear) layer fusion
• Conv & Conv → Conv
• Memory layout change
• For Input, Weight, and Output tensors
• Reshape, Transpose
• Zero-filling
13
𝑦 = 𝑎𝑥 + 𝑏 𝑧 = 𝑐𝑦 + 𝑑
𝑧 = (𝑎𝑐)𝑥 + (𝑏𝑐 + 𝑑)
and
→
1 2 3
4 5 6
1 2 3 4 5 6
1 4
2 5
3 6
New weight New bias
Reshape
Transpose
© 2020 StradVision
Case Studies
14
© 2020 StradVision
Case Study – YUV Input
Hypothetical target hardware
• YUV input, instead of RGB
• YUV2RGB equation is given
→ Fusion of YUV2RGB and the first Conv
15
Conv1
RGB Input
New Conv1
YUV Input
Conv1
YUV2RGB
YUV Input
Fusion
𝑅
𝐺
𝐵
=
𝑤𝑅𝑌 𝑤𝑅𝑈 𝑤𝑅𝑉
𝑤𝐺𝑌 𝑤𝐺𝑈 𝑤𝐺𝑉
𝑤𝐵𝑌 𝑤𝐵𝑈 𝑤𝐵𝑉
𝑌 + 𝑏𝑌
𝑈 + 𝑏𝑈
𝑉 + 𝑏𝑉
YUV2RGB equation
© 2020 StradVision
Case Study – YUV Input
Equations for new weights and new bias
16
𝑅
𝐺
𝐵
=
𝑤𝑅𝑌 𝑤𝑅𝑈 𝑤𝑅𝑉
𝑤𝐺𝑌 𝑤𝐺𝑈 𝑤𝐺𝑉
𝑤𝐵𝑌 𝑤𝐵𝑈 𝑤𝐵𝑉
𝑌 + 𝑏𝑌
𝑈 + 𝑏𝑈
𝑉 + 𝑏𝑉
YUV2RGB equation
𝑍 = 𝑤𝑅 𝑤𝐺 𝑤𝐵
𝑅
𝐺
𝐵
+ 𝑏
Conv1 equation
𝑍 = 𝑤𝑅 𝑤𝐺 𝑤𝐵
𝑤𝑅𝑌 𝑤𝑅𝑈 𝑤𝑅𝑉
𝑤𝐺𝑌 𝑤𝐺𝑈 𝑤𝐺𝑉
𝑤𝐵𝑌 𝑤𝐵𝑈 𝑤𝐵𝑉
𝑌
𝑈
𝑉
+ 𝑏 + 𝑤𝑅 𝑤𝐺 𝑤𝐵
𝑤𝑅𝑌 𝑤𝑅𝑈 𝑤𝑅𝑉
𝑤𝐺𝑌 𝑤𝐺𝑈 𝑤𝐺𝑉
𝑤𝐵𝑌 𝑤𝐵𝑈 𝑤𝐵𝑉
𝑏𝑌
𝑏𝑈
𝑏𝑉
New Conv1 equation
New weight New bias
Fusion
© 2020 StradVision
Case Study – Fixed-size Kernel Acceleration in Convolution
Hypothetical target hardware
• Only 5x5 kernel acceleration supported
• Applying zero-filling for smaller kernels
• Computational utilization
• 1x1 Conv ~ 4%, 3x3 Conv ~ 36%
17
w11 w12 w13
w21 w22 w23
w31 w32 w33
0 0 0 0 0
0 w11 w12 w13 0
0 w21 w22 w23 0
0 w31 w32 w33 0
0 0 0 0 0
Zero filling
© 2020 StradVision
1x1 Convolution
4x1 Convolution
Case Study – Fixed-size Kernel Acceleration in Convolution
→ Convert 1x1 Conv into 4(kh)x1(kw) Conv
18
Input
(ic)x(ih)x(iw)
Weight
(oc)x(ic)x(1)x(1)
Output.reshape(oc,oh,ow)
(oc)x(oh)x(ow)
Reshape Reshape Reshape
iw=input width
ih=input height
ic=input channel
kw=kernel width
kh=kernel height
ow=output width
oh=output height
oc=output channel
Input.reshape(ic/4,4,ih*iw)
(ic/4)x(4)x(ih*iw)
Weight.reshape(oc,ic/4,4,1)
(oc)x(ic/4)x(4)x(1)
Output
(oc)x(1)x(oh*ow)
© 2020 StradVision
1x1 Convolution
4x1 Convolution
4
3
2
19 20 21
22 23 24
13 14 15
16 17 18
7 8 9
10 11 12
Case Study – Fixed-size Kernel Acceleration in Convolution
→ Convert 1x1 Conv into 4(kh)x1(kw) Conv
19
1 2 3
4 5 6
1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
iw=3
ih=2
ic=4
iw’=ih*iw=6
ih’=ic=4
ic’=ic/4=1 Input
1
kw=1
kh=1
ic=4
oc=1
1
2
3
4
kw’=1
kh’=4
ic’=ic/4=1 oc’=oc=1
Weight
1 2 3 4 5 6
ow’=iw’=6
oh’=1
oc’=1
Output
1 2 3
4 5 6
ow=3
oh=2
Reshape Reshape Reshape
iw=input width
ih=input height
ic=input channel
kw=kernel width
kh=kernel height
ow=output width
oh=output height
oc=output channel
© 2020 StradVision
Case Study – No FC Support
Hypothetical target hardware
• No support for FC layers, or
• Only single-batched FC layers supported
20
1 2 3
4 5 6
1 4
2 5
3 6
1 2
3 4
ic=3
batch=2
ic=3
oc=2
* =
batch=2
oc=2
Matrix Multiplication in FC layer
Input Weight Output
© 2020 StradVision
Case Study – No FC Support
→ Convert FC into 1x1 Conv
21
1 2 3
4 5 6
1 4
2 5
3 6
1 2
3 4
ic=3
batch=2
ic=3
oc=2
* =
batch=2 oc=2
FC
Input Weight Output
3 6
2 5
1 4
iw’=batch=2
ih’=1
3
2
1
kw’=1
kh’=1
oc’=oc=2
6
5
4 2 4
1 3
ow’=batch=2
oh’=1
Input Weight Output
1x1 Conv
Transpose & Reshape
Input
(batch)x(ic)
Weight
(ic)x(oc)
Output
(batch)x(oc)
* =
transpose(Input)
.reshape(ic,1,batch)
(ic)x(1)x(batch)
transpose(Weight)
.reshape(oc,ic,1,1)
(oc)x(ic)x(1)x(1)
transpose(Output)
.reshape(oc,1,batch)
(oc)x(1)x(batch)
* =
© 2020 StradVision
Case Study – No Custom Layer Support in Quantization
Hypothetical target hardware
• Quantization should be done by a tool provided by hardware vendor
• No support for custom layers
• The tool accepts only image files as quantization inputs
→ Convert input tensors of custom layers into image files
• Useful for two-stage detection networks (E.g., Faster RCNN)
22
© 2020 StradVision
Case Study – No Custom Layer Support in Quantization
Convert 4D tensors into image files
23
19 20 21
22 23 24
13 14 15
16 17 18
7 8 9
10 11 12
1 2 3
4 5 6
w=3
h=2
43 44 45
46 47 48
37 38 39
40 41 42
31 32 33
34 35 36
25 26 27
28 29 30 batch=2
2(batch)x4(c)x2(h)x3(w)
Tensor in Int8 format
1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
25 26 27 28 29 30
31 32 33 34 35 36
37 38 39 40 41 42
43 44 45 46 47 48
w’=h*w=6
h’=batch*c=8
8(h’)x6(w’)
Gray image in Uint8 format
Reshape & Add 128
Subtract 128 & Reshape
Input.reshape(batch*c,h*w) + 128
(batch*c)x(h*w)
Input
(batch)x(c)x(h)x(w)
© 2020 StradVision
Case Study – Block-level Accumulation in Convolution
Hypothetical target hardware
• Accumulating intermediate results of Conv in 8(oc)x8(ic) channels
• For example,
24
Input
24(ic)x32(ih)x32(iw)
Weight
8(oc)x24(ic)x1(kh)x1(kw)
Conv ( , )
Input[0:8,:,:]
8(ic)x32(ih)x32(iw)
Weight[:,0:8,:,:]
8(oc)x8(ic)x1(kh)x1(kw)
Conv8(oc)x8(ic) ( , )
Input[8:16,:,:]
8(ic)x32(ih)x32(iw)
Weight[:,8:16,:,:]
8(oc)x8(ic)x1(kh)x1(kw)
Conv8(oc)x8(ic) ( , )
=
+
Input[16:24,:,:]
8(ic)x32(ih)x32(iw)
Weight[:,16:24,:,:]
8(oc)x8(ic)x1(kh)x1(kw)
Conv8(oc)x8(ic) ( , )
+
© 2020 StradVision
Case Study – Block-level Accumulation in Convolution
Hypothetical target hardware
• Accumulating intermediate results of Conv in 8(oc)x8(ic) channels
• For example,
25
1x1 Convolution
1
2
3
ih*iw
ic=24
Input
A B C
D E F
G H I
J K L
kh=kw=1
ic=24
oc=32
Weight
1A + 2B + 3C
1D + 2E + 3F
1G + 2H + 3I
1J + 2K + 3L
oh*ow
oc=32
Output
© 2020 StradVision
Case Study – Block-level Accumulation in Convolution
→ 8(oc)x8(ic) block-level sparsification
26
1x1 Conv Weight (dense)
ic=64
oc=32
1x1 Conv Weight (53% sparse)
ic=64
oc=32
Weightsparse(i-th block) =
Weightdense(i-th block) otherwise
0 if Σ|Weightdense(i-th block)| < θ
{
© 2020 StradVision
Conclusions
27
© 2020 StradVision
Conclusions
• Layer transformation techniques can reduce production-cost especially
for retraining of networks while supporting a diverse array of target
platforms
• It may also be beneficial for deep learning hardware manufacturers to
support a wide variety of networks not yet natively supported by their
hardware
28
© 2020 StradVision
Resources
29
• Sparsification
• “Exploring the Granularity of Sparsity in Convolution Neural Networks”, CVPR2017
https://openaccess.thecvf.com/content_cvpr_2017_workshops/w29/html/Mao_Ex
ploring_the_Granularity_CVPR_2017_paper.html

More Related Content

More from Edge AI and Vision Alliance

“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...Edge AI and Vision Alliance
 
“MIPI CSI-2 Image Sensor Interface Standard Features Enable Efficient Embedde...
“MIPI CSI-2 Image Sensor Interface Standard Features Enable Efficient Embedde...“MIPI CSI-2 Image Sensor Interface Standard Features Enable Efficient Embedde...
“MIPI CSI-2 Image Sensor Interface Standard Features Enable Efficient Embedde...Edge AI and Vision Alliance
 
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...Edge AI and Vision Alliance
 
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
“Practical Approaches to DNN Quantization,” a Presentation from Magic LeapEdge AI and Vision Alliance
 
"Optimizing Image Quality and Stereo Depth at the Edge," a Presentation from ...
"Optimizing Image Quality and Stereo Depth at the Edge," a Presentation from ..."Optimizing Image Quality and Stereo Depth at the Edge," a Presentation from ...
"Optimizing Image Quality and Stereo Depth at the Edge," a Presentation from ...Edge AI and Vision Alliance
 
“Using a Collaborative Network of Distributed Cameras for Object Tracking,” a...
“Using a Collaborative Network of Distributed Cameras for Object Tracking,” a...“Using a Collaborative Network of Distributed Cameras for Object Tracking,” a...
“Using a Collaborative Network of Distributed Cameras for Object Tracking,” a...Edge AI and Vision Alliance
 
“A Survey of Model Compression Methods,” a Presentation from Instrumental
“A Survey of Model Compression Methods,” a Presentation from Instrumental“A Survey of Model Compression Methods,” a Presentation from Instrumental
“A Survey of Model Compression Methods,” a Presentation from InstrumentalEdge AI and Vision Alliance
 
“Reinventing Smart Cities with Computer Vision,” a Presentation from Hayden AI
“Reinventing Smart Cities with Computer Vision,” a Presentation from Hayden AI“Reinventing Smart Cities with Computer Vision,” a Presentation from Hayden AI
“Reinventing Smart Cities with Computer Vision,” a Presentation from Hayden AIEdge AI and Vision Alliance
 

More from Edge AI and Vision Alliance (20)

“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
“Tracking and Fusing Diverse Risk Factors to Drive a SAFER Future,” a Present...
 
“MIPI CSI-2 Image Sensor Interface Standard Features Enable Efficient Embedde...
“MIPI CSI-2 Image Sensor Interface Standard Features Enable Efficient Embedde...“MIPI CSI-2 Image Sensor Interface Standard Features Enable Efficient Embedde...
“MIPI CSI-2 Image Sensor Interface Standard Features Enable Efficient Embedde...
 
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
“Introduction to the CSI-2 Image Sensor Interface Standard,” a Presentation f...
 
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
“Practical Approaches to DNN Quantization,” a Presentation from Magic Leap
 
"Optimizing Image Quality and Stereo Depth at the Edge," a Presentation from ...
"Optimizing Image Quality and Stereo Depth at the Edge," a Presentation from ..."Optimizing Image Quality and Stereo Depth at the Edge," a Presentation from ...
"Optimizing Image Quality and Stereo Depth at the Edge," a Presentation from ...
 
“Using a Collaborative Network of Distributed Cameras for Object Tracking,” a...
“Using a Collaborative Network of Distributed Cameras for Object Tracking,” a...“Using a Collaborative Network of Distributed Cameras for Object Tracking,” a...
“Using a Collaborative Network of Distributed Cameras for Object Tracking,” a...
 
“A Survey of Model Compression Methods,” a Presentation from Instrumental
“A Survey of Model Compression Methods,” a Presentation from Instrumental“A Survey of Model Compression Methods,” a Presentation from Instrumental
“A Survey of Model Compression Methods,” a Presentation from Instrumental
 
“Reinventing Smart Cities with Computer Vision,” a Presentation from Hayden AI
“Reinventing Smart Cities with Computer Vision,” a Presentation from Hayden AI“Reinventing Smart Cities with Computer Vision,” a Presentation from Hayden AI
“Reinventing Smart Cities with Computer Vision,” a Presentation from Hayden AI
 

Recently uploaded

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 

Recently uploaded (20)

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

“Designing Bespoke CNNs for Target Hardware,” a Presentation from StradVision

  • 1. © 2020 StradVision Designing Bespoke CNNs for Target Hardware Woonhyun Nam Director, Algorithms StradVision September, 2020
  • 2. © 2020 StradVision StradVision • We develop deep learning-based perception software for Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles 2 For more detailed information, please refer to https://youtu.be/tK7F8yvpiGA and https://stradvision.com/
  • 3. © 2020 StradVision Agenda • Development Process of Deep Learning-based Software • Scalability Challenge • Layer Transformations • Case Studies • Conclusions 3
  • 4. © 2020 StradVision Development Process of Deep Learning-based Software 4
  • 5. © 2020 StradVision Development Process Repeated for each task (e.g., object detection) and target platform 5 Design Train Deploy Validate • Reuse existing CNN designs • Tweak some layers • Retrain from scratch even for small changes • Revalidate from scratch if output changes (Pros) (Cons) Network Weights Binary Report
  • 7. © 2020 StradVision Development Process Scalability challenge • Number of networks to maintain grows rapidly • More than 100 networks become difficult to manage 7 Task Target platform # of edges = # of networks
  • 8. © 2020 StradVision Scalable Development Process Main idea • Grouping target platforms based on similar computational characteristics (or capabilities) with respect to a given CNN transformation 8 Task Target platform Same group # of edges = # of networks
  • 9. © 2020 StradVision Scalable Development Process • Design/Train CNNs for each group • Transform CNNs without retraining within same group 9 Design Train Transform Deploy Validate Network Weights Binary Report Network Binary Report Same group Weights
  • 10. © 2020 StradVision Scalable Development Process Cost-effective method • Increasing computational efficiency • E.g., lower latency, higher throughput, cheaper hardware • Increasing engineering efficiency • E.g., shorter development time, less engineers, more projects • Increasing economic efficiency • E.g., lower usage of cloud services for training, lower production-cost 10
  • 11. © 2020 StradVision Layer Transformations 11
  • 12. © 2020 StradVision Layer Transformations • Basic operations • Case studies • YUV Input • Fixed-size Kernel Acceleration in Convolution • No FC Support • No Custom Layer Support in Quantization • Block-level Accumulation in Convolution 12
  • 13. © 2020 StradVision Layer Transformations Basic operations • (Linear) layer fusion • Conv & Conv → Conv • Memory layout change • For Input, Weight, and Output tensors • Reshape, Transpose • Zero-filling 13 𝑦 = 𝑎𝑥 + 𝑏 𝑧 = 𝑐𝑦 + 𝑑 𝑧 = (𝑎𝑐)𝑥 + (𝑏𝑐 + 𝑑) and → 1 2 3 4 5 6 1 2 3 4 5 6 1 4 2 5 3 6 New weight New bias Reshape Transpose
  • 15. © 2020 StradVision Case Study – YUV Input Hypothetical target hardware • YUV input, instead of RGB • YUV2RGB equation is given → Fusion of YUV2RGB and the first Conv 15 Conv1 RGB Input New Conv1 YUV Input Conv1 YUV2RGB YUV Input Fusion 𝑅 𝐺 𝐵 = 𝑤𝑅𝑌 𝑤𝑅𝑈 𝑤𝑅𝑉 𝑤𝐺𝑌 𝑤𝐺𝑈 𝑤𝐺𝑉 𝑤𝐵𝑌 𝑤𝐵𝑈 𝑤𝐵𝑉 𝑌 + 𝑏𝑌 𝑈 + 𝑏𝑈 𝑉 + 𝑏𝑉 YUV2RGB equation
  • 16. © 2020 StradVision Case Study – YUV Input Equations for new weights and new bias 16 𝑅 𝐺 𝐵 = 𝑤𝑅𝑌 𝑤𝑅𝑈 𝑤𝑅𝑉 𝑤𝐺𝑌 𝑤𝐺𝑈 𝑤𝐺𝑉 𝑤𝐵𝑌 𝑤𝐵𝑈 𝑤𝐵𝑉 𝑌 + 𝑏𝑌 𝑈 + 𝑏𝑈 𝑉 + 𝑏𝑉 YUV2RGB equation 𝑍 = 𝑤𝑅 𝑤𝐺 𝑤𝐵 𝑅 𝐺 𝐵 + 𝑏 Conv1 equation 𝑍 = 𝑤𝑅 𝑤𝐺 𝑤𝐵 𝑤𝑅𝑌 𝑤𝑅𝑈 𝑤𝑅𝑉 𝑤𝐺𝑌 𝑤𝐺𝑈 𝑤𝐺𝑉 𝑤𝐵𝑌 𝑤𝐵𝑈 𝑤𝐵𝑉 𝑌 𝑈 𝑉 + 𝑏 + 𝑤𝑅 𝑤𝐺 𝑤𝐵 𝑤𝑅𝑌 𝑤𝑅𝑈 𝑤𝑅𝑉 𝑤𝐺𝑌 𝑤𝐺𝑈 𝑤𝐺𝑉 𝑤𝐵𝑌 𝑤𝐵𝑈 𝑤𝐵𝑉 𝑏𝑌 𝑏𝑈 𝑏𝑉 New Conv1 equation New weight New bias Fusion
  • 17. © 2020 StradVision Case Study – Fixed-size Kernel Acceleration in Convolution Hypothetical target hardware • Only 5x5 kernel acceleration supported • Applying zero-filling for smaller kernels • Computational utilization • 1x1 Conv ~ 4%, 3x3 Conv ~ 36% 17 w11 w12 w13 w21 w22 w23 w31 w32 w33 0 0 0 0 0 0 w11 w12 w13 0 0 w21 w22 w23 0 0 w31 w32 w33 0 0 0 0 0 0 Zero filling
  • 18. © 2020 StradVision 1x1 Convolution 4x1 Convolution Case Study – Fixed-size Kernel Acceleration in Convolution → Convert 1x1 Conv into 4(kh)x1(kw) Conv 18 Input (ic)x(ih)x(iw) Weight (oc)x(ic)x(1)x(1) Output.reshape(oc,oh,ow) (oc)x(oh)x(ow) Reshape Reshape Reshape iw=input width ih=input height ic=input channel kw=kernel width kh=kernel height ow=output width oh=output height oc=output channel Input.reshape(ic/4,4,ih*iw) (ic/4)x(4)x(ih*iw) Weight.reshape(oc,ic/4,4,1) (oc)x(ic/4)x(4)x(1) Output (oc)x(1)x(oh*ow)
  • 19. © 2020 StradVision 1x1 Convolution 4x1 Convolution 4 3 2 19 20 21 22 23 24 13 14 15 16 17 18 7 8 9 10 11 12 Case Study – Fixed-size Kernel Acceleration in Convolution → Convert 1x1 Conv into 4(kh)x1(kw) Conv 19 1 2 3 4 5 6 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 iw=3 ih=2 ic=4 iw’=ih*iw=6 ih’=ic=4 ic’=ic/4=1 Input 1 kw=1 kh=1 ic=4 oc=1 1 2 3 4 kw’=1 kh’=4 ic’=ic/4=1 oc’=oc=1 Weight 1 2 3 4 5 6 ow’=iw’=6 oh’=1 oc’=1 Output 1 2 3 4 5 6 ow=3 oh=2 Reshape Reshape Reshape iw=input width ih=input height ic=input channel kw=kernel width kh=kernel height ow=output width oh=output height oc=output channel
  • 20. © 2020 StradVision Case Study – No FC Support Hypothetical target hardware • No support for FC layers, or • Only single-batched FC layers supported 20 1 2 3 4 5 6 1 4 2 5 3 6 1 2 3 4 ic=3 batch=2 ic=3 oc=2 * = batch=2 oc=2 Matrix Multiplication in FC layer Input Weight Output
  • 21. © 2020 StradVision Case Study – No FC Support → Convert FC into 1x1 Conv 21 1 2 3 4 5 6 1 4 2 5 3 6 1 2 3 4 ic=3 batch=2 ic=3 oc=2 * = batch=2 oc=2 FC Input Weight Output 3 6 2 5 1 4 iw’=batch=2 ih’=1 3 2 1 kw’=1 kh’=1 oc’=oc=2 6 5 4 2 4 1 3 ow’=batch=2 oh’=1 Input Weight Output 1x1 Conv Transpose & Reshape Input (batch)x(ic) Weight (ic)x(oc) Output (batch)x(oc) * = transpose(Input) .reshape(ic,1,batch) (ic)x(1)x(batch) transpose(Weight) .reshape(oc,ic,1,1) (oc)x(ic)x(1)x(1) transpose(Output) .reshape(oc,1,batch) (oc)x(1)x(batch) * =
  • 22. © 2020 StradVision Case Study – No Custom Layer Support in Quantization Hypothetical target hardware • Quantization should be done by a tool provided by hardware vendor • No support for custom layers • The tool accepts only image files as quantization inputs → Convert input tensors of custom layers into image files • Useful for two-stage detection networks (E.g., Faster RCNN) 22
  • 23. © 2020 StradVision Case Study – No Custom Layer Support in Quantization Convert 4D tensors into image files 23 19 20 21 22 23 24 13 14 15 16 17 18 7 8 9 10 11 12 1 2 3 4 5 6 w=3 h=2 43 44 45 46 47 48 37 38 39 40 41 42 31 32 33 34 35 36 25 26 27 28 29 30 batch=2 2(batch)x4(c)x2(h)x3(w) Tensor in Int8 format 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 w’=h*w=6 h’=batch*c=8 8(h’)x6(w’) Gray image in Uint8 format Reshape & Add 128 Subtract 128 & Reshape Input.reshape(batch*c,h*w) + 128 (batch*c)x(h*w) Input (batch)x(c)x(h)x(w)
  • 24. © 2020 StradVision Case Study – Block-level Accumulation in Convolution Hypothetical target hardware • Accumulating intermediate results of Conv in 8(oc)x8(ic) channels • For example, 24 Input 24(ic)x32(ih)x32(iw) Weight 8(oc)x24(ic)x1(kh)x1(kw) Conv ( , ) Input[0:8,:,:] 8(ic)x32(ih)x32(iw) Weight[:,0:8,:,:] 8(oc)x8(ic)x1(kh)x1(kw) Conv8(oc)x8(ic) ( , ) Input[8:16,:,:] 8(ic)x32(ih)x32(iw) Weight[:,8:16,:,:] 8(oc)x8(ic)x1(kh)x1(kw) Conv8(oc)x8(ic) ( , ) = + Input[16:24,:,:] 8(ic)x32(ih)x32(iw) Weight[:,16:24,:,:] 8(oc)x8(ic)x1(kh)x1(kw) Conv8(oc)x8(ic) ( , ) +
  • 25. © 2020 StradVision Case Study – Block-level Accumulation in Convolution Hypothetical target hardware • Accumulating intermediate results of Conv in 8(oc)x8(ic) channels • For example, 25 1x1 Convolution 1 2 3 ih*iw ic=24 Input A B C D E F G H I J K L kh=kw=1 ic=24 oc=32 Weight 1A + 2B + 3C 1D + 2E + 3F 1G + 2H + 3I 1J + 2K + 3L oh*ow oc=32 Output
  • 26. © 2020 StradVision Case Study – Block-level Accumulation in Convolution → 8(oc)x8(ic) block-level sparsification 26 1x1 Conv Weight (dense) ic=64 oc=32 1x1 Conv Weight (53% sparse) ic=64 oc=32 Weightsparse(i-th block) = Weightdense(i-th block) otherwise 0 if Σ|Weightdense(i-th block)| < θ {
  • 28. © 2020 StradVision Conclusions • Layer transformation techniques can reduce production-cost especially for retraining of networks while supporting a diverse array of target platforms • It may also be beneficial for deep learning hardware manufacturers to support a wide variety of networks not yet natively supported by their hardware 28
  • 29. © 2020 StradVision Resources 29 • Sparsification • “Exploring the Granularity of Sparsity in Convolution Neural Networks”, CVPR2017 https://openaccess.thecvf.com/content_cvpr_2017_workshops/w29/html/Mao_Ex ploring_the_Granularity_CVPR_2017_paper.html