SlideShare a Scribd company logo
1 of 8
Fast and Scalable In-
memory Deep Multitask
Learning via Neural Weight
Virtualization
 This paper introduces the concept of Neural Weight Virtualization – which enables fast and scalable in-
memory multitask deep learning on memory-constrained embedded systems.
 The goal of neural weight virtualization is:
 packing multiple DNNs into a fixed-sized main memory whose combined memory requirement is larger than the main
memory.
 enabling fast in-memory execution of the DNNs.
 Propose a two-phase approach:
 virtualization of weight parameters.
 in-memory data structure and run-time execution framework for in-memory execution and context-switching of DNN
tasks.
 Implement two multitask learning systems:
 (1) an embedded GPU-based mobile robot
 (2) a microcontroller-based IoT device.
 Paper evaluates the proposed algorithms as well as the two systems that involve ten state-of-the-art DNNs.
Our evaluation shows that weight virtualization improves memory efficiency, execution time, and energy
efficiency of the multitask learning systems by 4.1x, 36.9x, and 4.2x, respectively.
Introduction
 Packing multiple DNNs into the main memory via weight virtualization is
motivated by: only a small fraction of DNN weights have significant impacts
on the inference result. these high-significance weights are concentrated in a
few blocks in the main memory. Goal: designing a scalable DNN packing
algorithm that matches similar blocks of weights across multiple DNNs and
combines them to construct a single new block of virtual weights that is
shared by the DNNs.
 The matching process is followed by an optimization (retraining) process to
gain back any loss of the inference accuracy of the DNNs.
Weight Virtualization Framework
Two-phase weight virtualization framework (Figure 4) having an offline and an online phase.
Weight Virtualization (Offline).
 Weight virtualization is divided into three steps.
 For each DNN, its weight parameters are split into fixed-sized weight-pages. In Figure 4, the page
size is 100, and each DNN has up to four weight-pages as the main memory size is 400.
 The weight-pages of all DNNs are matched, and similar weight-pages are grouped together.
 The matched weight-pages in each group are combined to create a single virtual weight-page by
optimizing (re-training) all or some of the DNNs. The goal of this optimization is to retain the
accuracy of each DNN that shares one or more virtual weight-pages with others.
 In-Memory Execution (Online).
 The virtual weight-pages are loaded into the main memory of the system.
 For each DNN, a pagmatching table that points a set of memory addresses of virtual weight-pages
required by the DNN is generated.
 At run-time, a DNN is executed in the main memory with the necessary weight parameters obtained
by dereferencing its page matching table.
 Matching Objective. Given two sets of weight-pages, 𝑃𝑖 and 𝑃0, and a weight-
page matching cost function, 𝐶.
 The goal of the matching step is to find the closest set of pairs. Figure 6
illustrates a matching between task weight-page set 𝑃𝑖 and the virtual
weight-page set 𝑃0.
 Weight-Page Optimization
 combine the matched weight-pages into single pages and optimize (retrain)
the tasks to retain any accuracy loss due to the changed weight parameters.

More Related Content

Similar to PresentationDNN.pptx

Ccpro_Presentationfddgdddcfbbjjdssvhbvxf11.pptx
Ccpro_Presentationfddgdddcfbbjjdssvhbvxf11.pptxCcpro_Presentationfddgdddcfbbjjdssvhbvxf11.pptx
Ccpro_Presentationfddgdddcfbbjjdssvhbvxf11.pptx
19526YuvaKumarIrigi
 
Jovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloudJovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloud
Bharat Rane
 
Performance Evaluation of Server Consolidation Algorithms in Virtualized Clo...
Performance Evaluation of Server Consolidation Algorithms  in Virtualized Clo...Performance Evaluation of Server Consolidation Algorithms  in Virtualized Clo...
Performance Evaluation of Server Consolidation Algorithms in Virtualized Clo...
Susheel Thakur
 
JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011
Satya Ramachandran
 
A request skew aware heterogeneous distributed
A request skew aware heterogeneous distributedA request skew aware heterogeneous distributed
A request skew aware heterogeneous distributed
João Gabriel Lima
 

Similar to PresentationDNN.pptx (20)

CLOUD COMPUTING – PARTITIONING ALGORITHM AND LOAD BALANCING ALGORITHM
CLOUD COMPUTING – PARTITIONING ALGORITHM AND LOAD BALANCING ALGORITHMCLOUD COMPUTING – PARTITIONING ALGORITHM AND LOAD BALANCING ALGORITHM
CLOUD COMPUTING – PARTITIONING ALGORITHM AND LOAD BALANCING ALGORITHM
 
CLOUD COMPUTING – PARTITIONING ALGORITHM AND LOAD BALANCING ALGORITHM
CLOUD COMPUTING – PARTITIONING ALGORITHM AND LOAD BALANCING ALGORITHMCLOUD COMPUTING – PARTITIONING ALGORITHM AND LOAD BALANCING ALGORITHM
CLOUD COMPUTING – PARTITIONING ALGORITHM AND LOAD BALANCING ALGORITHM
 
Ccpro_Presentationfddgdddcfbbjjdssvhbvxf11.pptx
Ccpro_Presentationfddgdddcfbbjjdssvhbvxf11.pptxCcpro_Presentationfddgdddcfbbjjdssvhbvxf11.pptx
Ccpro_Presentationfddgdddcfbbjjdssvhbvxf11.pptx
 
Scalable and adaptive data replica placement for geo distributed cloud storages
Scalable and adaptive data replica placement for geo distributed cloud storagesScalable and adaptive data replica placement for geo distributed cloud storages
Scalable and adaptive data replica placement for geo distributed cloud storages
 
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONSMULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
 
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONSMULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
 
A Step Towards A Scalable Dynamic Single Assignment Conversion
A Step Towards A Scalable Dynamic Single Assignment ConversionA Step Towards A Scalable Dynamic Single Assignment Conversion
A Step Towards A Scalable Dynamic Single Assignment Conversion
 
CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...
CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...
CONTENT BASED VIDEO CATEGORIZATION USING RELATIONAL CLUSTERING WITH LOCAL SCA...
 
Jovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloudJovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloud
 
shashank_hpca1995_00386533
shashank_hpca1995_00386533shashank_hpca1995_00386533
shashank_hpca1995_00386533
 
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU...
 
WEIGHTED CONSTRAINT SATISFACTION AND GENETIC ALGORITHM TO SOLVE THE VIEW SELE...
WEIGHTED CONSTRAINT SATISFACTION AND GENETIC ALGORITHM TO SOLVE THE VIEW SELE...WEIGHTED CONSTRAINT SATISFACTION AND GENETIC ALGORITHM TO SOLVE THE VIEW SELE...
WEIGHTED CONSTRAINT SATISFACTION AND GENETIC ALGORITHM TO SOLVE THE VIEW SELE...
 
Performance Evaluation of Server Consolidation Algorithms in Virtualized Clo...
Performance Evaluation of Server Consolidation Algorithms  in Virtualized Clo...Performance Evaluation of Server Consolidation Algorithms  in Virtualized Clo...
Performance Evaluation of Server Consolidation Algorithms in Virtualized Clo...
 
Load Rebalancing for Distributed Hash Tables in Cloud Computing
Load Rebalancing for Distributed Hash Tables in Cloud ComputingLoad Rebalancing for Distributed Hash Tables in Cloud Computing
Load Rebalancing for Distributed Hash Tables in Cloud Computing
 
C017311316
C017311316C017311316
C017311316
 
SSBSE10.ppt
SSBSE10.pptSSBSE10.ppt
SSBSE10.ppt
 
OS virtual memory
OS virtual memoryOS virtual memory
OS virtual memory
 
JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011
 
4 026
4 0264 026
4 026
 
A request skew aware heterogeneous distributed
A request skew aware heterogeneous distributedA request skew aware heterogeneous distributed
A request skew aware heterogeneous distributed
 

More from Tamer Nadeem (8)

Chapter 1.ppt
Chapter 1.pptChapter 1.ppt
Chapter 1.ppt
 
Semi-Supervised.pptx
Semi-Supervised.pptxSemi-Supervised.pptx
Semi-Supervised.pptx
 
4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx
4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx
4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx
 
qos-f05.pdf
qos-f05.pdfqos-f05.pdf
qos-f05.pdf
 
lecture1.pdf
lecture1.pdflecture1.pdf
lecture1.pdf
 
cs229-probability_review_slides.pdf
cs229-probability_review_slides.pdfcs229-probability_review_slides.pdf
cs229-probability_review_slides.pdf
 
N00014 21-s-f003
N00014 21-s-f003N00014 21-s-f003
N00014 21-s-f003
 
Ci carplay-cic
Ci carplay-cicCi carplay-cic
Ci carplay-cic
 

Recently uploaded

Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
AnaAcapella
 
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
MysoreMuleSoftMeetup
 
QUATER-1-PE-HEALTH-LC2- this is just a sample of unpacked lesson
QUATER-1-PE-HEALTH-LC2- this is just a sample of unpacked lessonQUATER-1-PE-HEALTH-LC2- this is just a sample of unpacked lesson
QUATER-1-PE-HEALTH-LC2- this is just a sample of unpacked lesson
httgc7rh9c
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
EADTU
 

Recently uploaded (20)

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Tatlong Kwento ni Lola basyang-1.pdf arts
Tatlong Kwento ni Lola basyang-1.pdf artsTatlong Kwento ni Lola basyang-1.pdf arts
Tatlong Kwento ni Lola basyang-1.pdf arts
 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
 
Observing-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxObserving-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptx
 
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
MuleSoft Integration with AWS Textract | Calling AWS Textract API |AWS - Clou...
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Including Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdfIncluding Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdf
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Details on CBSE Compartment Exam.pptx1111
Details on CBSE Compartment Exam.pptx1111Details on CBSE Compartment Exam.pptx1111
Details on CBSE Compartment Exam.pptx1111
 
QUATER-1-PE-HEALTH-LC2- this is just a sample of unpacked lesson
QUATER-1-PE-HEALTH-LC2- this is just a sample of unpacked lessonQUATER-1-PE-HEALTH-LC2- this is just a sample of unpacked lesson
QUATER-1-PE-HEALTH-LC2- this is just a sample of unpacked lesson
 
diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
 
Michaelis Menten Equation and Estimation Of Vmax and Tmax.pptx
Michaelis Menten Equation and Estimation Of Vmax and Tmax.pptxMichaelis Menten Equation and Estimation Of Vmax and Tmax.pptx
Michaelis Menten Equation and Estimation Of Vmax and Tmax.pptx
 
Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
What is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptxWhat is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptx
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Diuretic, Hypoglycemic and Limit test of Heavy metals and Arsenic.-1.pdf
Diuretic, Hypoglycemic and Limit test of Heavy metals and Arsenic.-1.pdfDiuretic, Hypoglycemic and Limit test of Heavy metals and Arsenic.-1.pdf
Diuretic, Hypoglycemic and Limit test of Heavy metals and Arsenic.-1.pdf
 

PresentationDNN.pptx

  • 1. Fast and Scalable In- memory Deep Multitask Learning via Neural Weight Virtualization
  • 2.  This paper introduces the concept of Neural Weight Virtualization – which enables fast and scalable in- memory multitask deep learning on memory-constrained embedded systems.  The goal of neural weight virtualization is:  packing multiple DNNs into a fixed-sized main memory whose combined memory requirement is larger than the main memory.  enabling fast in-memory execution of the DNNs.  Propose a two-phase approach:  virtualization of weight parameters.  in-memory data structure and run-time execution framework for in-memory execution and context-switching of DNN tasks.  Implement two multitask learning systems:  (1) an embedded GPU-based mobile robot  (2) a microcontroller-based IoT device.  Paper evaluates the proposed algorithms as well as the two systems that involve ten state-of-the-art DNNs. Our evaluation shows that weight virtualization improves memory efficiency, execution time, and energy efficiency of the multitask learning systems by 4.1x, 36.9x, and 4.2x, respectively.
  • 3. Introduction  Packing multiple DNNs into the main memory via weight virtualization is motivated by: only a small fraction of DNN weights have significant impacts on the inference result. these high-significance weights are concentrated in a few blocks in the main memory. Goal: designing a scalable DNN packing algorithm that matches similar blocks of weights across multiple DNNs and combines them to construct a single new block of virtual weights that is shared by the DNNs.  The matching process is followed by an optimization (retraining) process to gain back any loss of the inference accuracy of the DNNs.
  • 4. Weight Virtualization Framework Two-phase weight virtualization framework (Figure 4) having an offline and an online phase. Weight Virtualization (Offline).  Weight virtualization is divided into three steps.  For each DNN, its weight parameters are split into fixed-sized weight-pages. In Figure 4, the page size is 100, and each DNN has up to four weight-pages as the main memory size is 400.  The weight-pages of all DNNs are matched, and similar weight-pages are grouped together.  The matched weight-pages in each group are combined to create a single virtual weight-page by optimizing (re-training) all or some of the DNNs. The goal of this optimization is to retain the accuracy of each DNN that shares one or more virtual weight-pages with others.  In-Memory Execution (Online).  The virtual weight-pages are loaded into the main memory of the system.  For each DNN, a pagmatching table that points a set of memory addresses of virtual weight-pages required by the DNN is generated.  At run-time, a DNN is executed in the main memory with the necessary weight parameters obtained by dereferencing its page matching table.
  • 5.
  • 6.
  • 7.  Matching Objective. Given two sets of weight-pages, 𝑃𝑖 and 𝑃0, and a weight- page matching cost function, 𝐶.  The goal of the matching step is to find the closest set of pairs. Figure 6 illustrates a matching between task weight-page set 𝑃𝑖 and the virtual weight-page set 𝑃0.
  • 8.  Weight-Page Optimization  combine the matched weight-pages into single pages and optimize (retrain) the tasks to retain any accuracy loss due to the changed weight parameters.