OpML '19 presentation
1. Transfer Learning for Performance Modeling of Deep Neural Network Systems
Md Shahriar Iqbal, Lars Kotthoff, Pooyan Jamshidi
2. Performance can change across systems
Can we expect similar performance in a new environment?
[Figure: a model is pushed from Environment 1.0 (Dev/Staging) to Environment 2.0 (Production); the performance metrics of interest are inference time and energy consumption.]
3. Case study: Performance behavior changes across environments, and so does the optimal configuration
[Figure: across the measured cases, the default configuration is 2.1x to 5.1x worse than optimal, and the configuration that is optimal in Environment 1.0 is 1.9x to 4.9x worse than optimal in the new environment.]
How can the developer tune parameters for energy consumption during source code development?
4. The DNN system stack allows developers to tune parameters for energy consumption
[Figure: the stack spans the Network/Model level, Compiler level, Deployment level, and Hardware/OS level; tunable parameters include core status, core frequency, GPU status, GPU frequency, memory controller frequency, etc.]
DNN systems are highly configurable: with >100 options, 2^100 configurations are possible.
5. Building the configuration space of DNN systems
[Figure: at the Hardware/OS level, options such as core status, core frequency, GPU status, and GPU frequency are combined into a configuration space; each configuration is measured for inference time and energy consumption.]
6. A typical approach to understanding system behavior is sensitivity analysis
The configuration space is C = O1 x O2 x O3 x O4 x ... x O20. Each sampled configuration c_i is a binary vector with a measured performance value y_i:
c1 = 0x0x0x0x...x1, y1 = f(c1)
c2 = 0x0x1x0x...x1, y2 = f(c2)
c3 = 1x0x0x0x...x0, y3 = f(c3)
...
cn = 1x1x1x0x...x1, yn = f(cn)
From this training set, a model f* -> f is learned.
7. Performance models can take the form of any black-box model
The same training set of measured configurations (c_i, y_i = f(c_i)) shown above can be fed to any black-box regression model to learn f* -> f.
8. Evaluating a performance model
The learned model f* is evaluated against measured values y_i = f(c_i) using the mean absolute percentage error, MAPE(f*, f).
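A minimal sketch of the MAPE computation (the numbers below are illustrative, not measurements from the talk):

```python
# Mean absolute percentage error between measured and predicted performance.
def mape(y_true, y_pred):
    return 100.0 * sum(abs(t - p) / abs(t)
                       for t, p in zip(y_true, y_pred)) / len(y_true)

measured  = [10.0, 20.0, 40.0]   # hypothetical y_i = f(c_i)
predicted = [11.0, 22.0, 44.0]   # hypothetical f*(c_i), each 10% off
print(round(mape(measured, predicted), 2))  # -> 10.0
```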
9. A performance model contains useful information about option interactions
f: C -> R
f(.) = 1.2 + 3*O1 + 5*O2 + 7*O7 + 8*O1*O2 - 3.4*O1*O7 + 5*O3*O7 + 11.2*O1*O2*O7
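Written out as code (a direct transcription of the slide's formula, with options as 0/1 values), the interaction terms are easy to probe:

```python
# The performance model from the slide, as a function of binary options.
def f(o1, o2, o3, o7):
    return (1.2 + 3 * o1 + 5 * o2 + 7 * o7
            + 8 * o1 * o2 - 3.4 * o1 * o7
            + 5 * o3 * o7 + 11.2 * o1 * o2 * o7)

# The O1*O2 interaction: enabling both options together adds 8 beyond
# the sum of their individual effects.
interaction = f(1, 1, 0, 0) - f(1, 0, 0, 0) - f(0, 1, 0, 0) + f(0, 0, 0, 0)
print(round(interaction, 2))  # -> 8.0
```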
10. What are the roadblocks to performance modeling of DNN systems?
● The configuration space grows exponentially with each new option.
● Performance measurements are costly: 2^20 = 1,048,576 configurations take nearly 2 years to measure, at an average of only 60 s per configuration.
Transfer learning can help reuse cheap measurements from old environments in new environments, to quickly learn system behavior through performance modeling.
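The cost arithmetic checks out; a quick sanity check (the 60 s figure is the slide's stated average):

```python
# 20 binary options -> 2^20 configurations; ~60 s to measure each one.
n_configs = 2 ** 20
total_seconds = n_configs * 60
years = total_seconds / (3600 * 24 * 365)
print(n_configs, round(years, 2))  # 1048576 configurations, ~2 years
```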
12. Transfer learning: linear/non-linear model shift
Linear and non-linear model shifts are learned between the source model f_s, built from all source samples s (experimentation cost: 0, since these measurements already exist), and the target model f_t, built from random sub-samples t* of the target environment (experimentation cost: m(t*)), yielding the transferred model f*.
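A hedged sketch of the linear variant (all models and values here are synthetic illustrations; the talk does not spell out this exact formulation): the source model f_s is reused as-is, and a linear correction is fit from only a few measured target configurations t*:

```python
import numpy as np

f_s = lambda c: 1.2 + 3 * c[0] + 5 * c[1]   # source model (all source samples)
f_t_true = lambda c: 2.0 * f_s(c) + 0.5     # unknown target-environment behavior

# few random target sub-samples t*, measured in the new environment
t_star = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
src = np.array([f_s(c) for c in t_star])
tgt = np.array([f_t_true(c) for c in t_star])   # cost: m(t*) measurements

a, b = np.polyfit(src, tgt, 1)                  # learn the linear shift
f_t = lambda c: a * f_s(c) + b                  # transferred model f*
```

With only a handful of target measurements, the shift recovers the target behavior because the synthetic relationship is exactly linear; a non-linear shift would replace the degree-1 fit with a more flexible regressor.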
15. Results: Inference Time

                         Cost                 Reduction of Error
Direct Model Transfer    0                    1
Linear Model Shift       2.48 hours (0.15%)   3.48
Non-Linear Model Shift   2.48 hours (0.15%)   5.76
Guided Sampling          2.48 hours (0.15%)   22.93
16. Results: Energy Consumption

                         Cost                 Reduction of Error
Direct Model Transfer    0                    1
Linear Model Shift       2.48 hours (0.15%)   7.19
Non-Linear Model Shift   2.48 hours (0.15%)   16.31
Guided Sampling          2.48 hours (0.15%)   42.85
17. Useful tool for practitioners
● To build performance models in new environments using only 2.44% of the configuration space.
● To avoid invalid configurations and quickly find optimal configurations.
● To learn the performance landscape of a system for performance debugging, e.g., detecting performance regressions.