Opml 19-presentation-pdf

Transfer Learning for Performance Modeling of Deep Neural Network Systems
Md Shahriar Iqbal, Lars Kotthoff, Pooyan Jamshidi

Performance can change across systems
Can we expect similar performance in a new environment?
(Figure: code is pushed from Environment 1.0 (Dev/Staging) to Environment 2.0 (Production).)
Performance
- Inference time
- Energy consumption

Case study: Performance behavior changes across environments, and so does the optimal configuration
(Figure: panels comparing default and optimal configurations. The default configuration is 2.8, 2.1, 2.6, and 5.1 times worse than the optimal one in the respective panels. The configuration that is optimal in Environment 1.0 is 1.9 and 4.9 times worse than the optimal one in two of the panels.)
How can the developer tune parameters for energy consumption during source code development?

The DNN system stack allows developers to tune parameters for energy consumption
DNN systems are highly configurable: more than 100 options, $2^{100}$ configurations possible.
(Figure: the DNN system stack, labeled Network Level, Model, Compiler Level, Deployment Level, and Hardware/OS Level.)
Tunable parameters include core status, core frequency, GPU status, GPU frequency, memory controller frequency, etc.

Building the configuration space of DNN systems
(Figure: Hardware/OS Level options (core frequency, core status, GPU status, GPU frequency) make up the configuration space. Each configuration is measured for inference time and energy consumption.)
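A configuration space like the one on this slide can be sketched as the Cartesian product of the option domains. The option names come from the slides; the value lists below are hypothetical placeholders, since the slides do not give the actual frequencies or states.

```python
from itertools import product

# Hypothetical option domains: the slides name the options but not their values.
OPTIONS = {
    "core_status": [0, 1],
    "core_frequency_mhz": [345, 1113, 2035],
    "gpu_status": [0, 1],
    "gpu_frequency_mhz": [140, 726, 1300],
    "memory_controller_frequency_mhz": [800, 1866],
}


def configuration_space(options):
    """Yield every configuration as a dict mapping option name to value."""
    names = list(options)
    for values in product(*(options[name] for name in names)):
        yield dict(zip(names, values))


configs = list(configuration_space(OPTIONS))
print(len(configs))  # 2 * 3 * 2 * 3 * 2 = 72
```

Even five options with two or three values each give 72 configurations, which is why exhaustive measurement stops being practical as options are added.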

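A typical approach to understand system behavior is sensitivity analysis
Training set:
C        $O_1 \times O_2 \times O_3 \times O_4 \times \dots \times O_{20}$
$c_1$    0 x 0 x 0 x 0 x ... x 1        $y_1 = f(c_1)$
$c_2$    0 x 0 x 1 x 0 x ... x 1        $y_2 = f(c_2)$
$c_3$    1 x 0 x 0 x 0 x ... x 0        $y_3 = f(c_3)$
...      ...                            ...
$c_n$    1 x 1 x 1 x 0 x ... x 1        $y_n = f(c_n)$
The training set is used to learn $f^*$, an approximation of $f$ ($f^* \to f$).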
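As a minimal sketch of the idea on this slide, the code below measures a base configuration plus one configuration per toggled option, then learns an additive model from those measurements. The `measure` function is a hypothetical stand-in for a real measurement, and the one-at-a-time design is an assumption; the slides do not say how the training configurations are chosen.

```python
from typing import Callable, List, Tuple

Config = Tuple[int, ...]


def measure(c: Config) -> float:
    """Hypothetical stand-in for a real measurement (e.g. inference time in ms)."""
    o1, o2, o3, o4 = c
    return 10.0 + 4.0 * o1 + 2.0 * o2 + 0.0 * o3 + 1.0 * o4


def one_at_a_time(n_options: int, f: Callable[[Config], float]):
    """Measure the all-off configuration, then toggle each option on in turn."""
    base = (0,) * n_options
    y0 = f(base)
    training: List[Tuple[Config, float]] = [(base, y0)]
    effects: List[float] = []
    for i in range(n_options):
        c = tuple(1 if j == i else 0 for j in range(n_options))
        y = f(c)
        training.append((c, y))
        effects.append(y - y0)  # effect of option i relative to the base
    return training, y0, effects


def predict(c: Config, intercept: float, effects: List[float]) -> float:
    """Learned additive model f*: intercept plus the effects of enabled options."""
    return intercept + sum(e * o for e, o in zip(effects, c))


training, intercept, effects = one_at_a_time(4, measure)
print(effects)
print(predict((1, 1, 0, 1), intercept, effects))
```

An additive model of this kind cannot capture interactions between options, which is one reason richer models are used in practice.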

Performance models can take the form of any black-box model
(Figure: the training set of configurations $c_1, \dots, c_n$ over options $O_1 \times O_2 \times \dots \times O_{20}$, with measurements $y_i = f(c_i)$, from which a model $f^*$ approximating $f$ is obtained.)

Evaluating a performance model
(Figure: the training set of configurations $c_1, \dots, c_n$ over options $O_1 \times O_2 \times \dots \times O_{20}$, with measurements $y_i = f(c_i)$. The learned model $f^*$ is compared against $f$ using the mean absolute percentage error, MAPE($f^*$, $f$).)
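The MAPE formula itself did not survive on this slide, so the sketch below uses the standard definition of mean absolute percentage error, assuming that is what the authors intend. The sample numbers are made up for illustration.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    `actual` holds measured values f(c_i); `predicted` holds f*(c_i).
    Measured values must be non-zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up measurements and predictions for three configurations.
measured = [100.0, 200.0, 50.0]
predicted = [110.0, 180.0, 50.0]
print(round(mape(measured, predicted), 2))  # 6.67
```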

A performance model contains useful information about option interactions
$f: C \to \mathbb{R}$
$f(\cdot) = 1.2 + 3O_1 + 5O_2 + 7O_7 + 8O_1O_2 - 3.4O_1O_7 + 5O_3O_7 + 11.2O_1O_2O_7$
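To make the interaction terms concrete, the model on this slide can be evaluated for a given set of enabled options. The coefficients are the ones shown on the slide; representing each term as a tuple of option indices is just one convenient encoding, not the authors' implementation.

```python
# Each key lists the options that must all be enabled for the term to apply.
# The empty tuple is the constant term.
TERMS = {
    (): 1.2,
    (1,): 3.0,
    (2,): 5.0,
    (7,): 7.0,
    (1, 2): 8.0,
    (1, 7): -3.4,
    (3, 7): 5.0,
    (1, 2, 7): 11.2,
}


def f(enabled: set) -> float:
    """Sum the coefficients of every term whose options are all enabled."""
    return sum(coef for opts, coef in TERMS.items() if set(opts) <= enabled)


print(round(f(set()), 1))      # 1.2: constant term only
print(round(f({1, 2}), 1))     # 17.2: 1.2 + 3 + 5 + 8
print(round(f({1, 2, 7}), 1))  # 32.0: all terms except the O3*O7 interaction
```

Enabling O1 and O2 together costs 8 more than the sum of their individual effects, which is the kind of interaction a purely additive model would miss.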

What are the roadblocks for performance modeling of DNN systems?
● The configuration space grows exponentially as new options are added.
● Performance measurements are costly.
With 20 binary options, $2^{20} = 1048576$ configurations take nearly 2 years to measure, assuming only 60 s on average per configuration.
Transfer learning can help by reusing cheap measurements from old environments to quickly learn system behavior in new environments through performance modeling.
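The "nearly 2 years" figure follows directly from the numbers on the slide, as this short calculation shows. It assumes measurements run back to back with no parallelism and uses a 365-day year.

```python
n_options = 20                # binary options, as on the slide
seconds_per_measurement = 60  # average cost per configuration

n_configs = 2 ** n_options
total_seconds = n_configs * seconds_per_measurement
years = total_seconds / (365 * 24 * 3600)

print(n_configs)        # 1048576
print(round(years, 2))  # 1.99 (about 2 years)
```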

Direct Model Transfer Learning
The model learned in the source environment is used directly in the target environment.
(Figure: source $s \to f_s(s)$; target $t \to f_s(t)$.)
Experimentation cost: 0

Linear/Non-Linear Model Shift Transfer Learning
Linear and non-linear model shifts are learned using the source model (all samples) and the target model (random sub-samples).
(Figure: source $s \to f_s(s)$, experimentation cost 0; target sub-sample $t^* \to f_t(t^*)$, yielding $f^*$, experimentation cost $m(t^*)$.)
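A linear model shift can be sketched as fitting a line that maps source-model predictions to a few target measurements. The exact form of the shift is not given on the slide, so the single-variable map y_t = a * f_s(c) + b below is an assumption, and both `source_model` and the target measurements are hypothetical.

```python
from typing import Sequence, Tuple


def source_model(c: Tuple[int, int]) -> float:
    """Hypothetical model f_s learned from all source samples."""
    o1, o2 = c
    return 10.0 + 4.0 * o1 + 2.0 * o2


def fit_linear_shift(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of ys ~ a * xs + b (closed form for one variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("source predictions must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Random sub-sample t* measured in the target (made-up values).
target_sample = [(0, 0), (1, 0), (1, 1)]
target_measurements = [18.0, 24.0, 27.0]

a, b = fit_linear_shift([source_model(c) for c in target_sample], target_measurements)


def shifted_model(c: Tuple[int, int]) -> float:
    """Target prediction f*: the source prediction passed through the shift."""
    return a * source_model(c) + b


print(round(a, 3), round(b, 3))        # 1.5 3.0
print(round(shifted_model((0, 1)), 3))  # 21.0 for an unmeasured configuration
```

Direct model transfer corresponds to using `source_model` unchanged, which would predict 12.0 for the same configuration. The shift buys the correction at the cost of the three target measurements.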

Guided Sampling Transfer Learning
Influential configurations are sampled in the target environment, guided by the option interactions learned in the source environment, and then used to build the performance model.
(Figure: source $s \to f_s(s)$; target sample $t^* \to f_t(t^*)$.)
Experimentation cost: $m(t^*)$
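One way to read guided sampling is: rank options by how much they matter in the source model, then spend the target measurement budget on combinations of the top-ranked options. The sketch below follows that reading. The scoring rule (sum of absolute coefficients), the cutoff `k`, and the default value 0 for all other options are assumptions, not details from the slides.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical source-model terms: option indices -> coefficient.
SOURCE_TERMS: Dict[Tuple[int, ...], float] = {
    (1,): 3.0,
    (2,): 5.0,
    (7,): 7.0,
    (1, 2): 8.0,
    (1, 7): -3.4,
    (3, 7): 5.0,
    (1, 2, 7): 11.2,
}


def influence(terms: Dict[Tuple[int, ...], float]) -> Dict[int, float]:
    """Score each option by the absolute coefficients of the terms it appears in."""
    score: Dict[int, float] = {}
    for opts, coef in terms.items():
        for o in opts:
            score[o] = score.get(o, 0.0) + abs(coef)
    return score


def guided_sample(terms, n_options: int, k: int):
    """Enumerate all on/off combinations of the k most influential options."""
    score = influence(terms)
    top = sorted(sorted(score, key=lambda o: (-score[o], o))[:k])
    configs: List[Tuple[int, ...]] = []
    for bits in product([0, 1], repeat=len(top)):
        c = [0] * n_options  # every other option stays at its assumed default
        for o, bit in zip(top, bits):
            c[o - 1] = bit   # options are numbered from 1
        configs.append(tuple(c))
    return top, configs


top, configs = guided_sample(SOURCE_TERMS, n_options=20, k=3)
print(top)           # [1, 2, 7]
print(len(configs))  # 8 configurations to measure in the target
```

With 20 options this reduces the target measurements from 2^20 to 8 in this toy setup. How well that works depends on the influential options being the same in both environments.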

Experimental Setup
(Figure: workflow in two phases.)
Measurement: select environment, generate configuration, run inference, compute inference time, compute energy consumption.
Analysis: extract knowledge to sample the configuration space from source measurements, transfer knowledge, build performance model in target, compute prediction error.
$f_s(s) = a_1O_1 + a_2O_3 + a_3O_7 + a_4O_1O_3 + a_5O_1O_7 + \dots + a_7O_1O_3O$

Results: Inference Time
Method                    Cost                  Reduction of Error
Direct Model Transfer     0                     1
Linear Model Shift        2.48 hours (0.15%)    3.48
Non-Linear Model Shift    2.48 hours (0.15%)    5.76
Guided Sampling           2.48 hours (0.15%)    22.93

Results: Energy Consumption
Method                    Cost                  Reduction of Error
Direct Model Transfer     0                     1
Linear Model Shift        2.48 hours (0.15%)    7.19
Non-Linear Model Shift    2.48 hours (0.15%)    16.31
Guided Sampling           2.48 hours (0.15%)    42.85

Useful tool for practitioners
● To build performance models in new environments using only 2.44% of the configuration space.
● To avoid invalid configurations and quickly find optimal ones.
● To learn the performance landscape of a system for performance debugging (e.g., performance regressions).
