Architectural Tradeoff in
Learning-Based Software
Pooyan Jamshidi

Carnegie Mellon University
cs.cmu.edu/~pjamshid
SATURN 2018

Texas, May 8th, 2018
Who am I?
I hang around here!
Software
Arch.
Machine
Learning
Distributed
Systems
Goal: Enable developers/users
to find the right quality tradeoff
Today’s most popular (ML) systems are configurable
Empirical observations confirm that systems are
becoming increasingly configurable
[Figure: number of configuration parameters vs. release time, growing steadily across releases of Storage-A, MySQL, Apache (1.3.14 to 2.3.4), and Hadoop MapReduce/HDFS (0.1.0 to 2.0.0)]
[Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]
“all constants should be configurable,
even if we can’t see any reason to
configure them.” — HDFS-4304
Configurations determine the performance behavior

void Parrot_setenv(... name, ... value){
#ifdef PARROT_HAS_SETENV
    my_setenv(name, value, 1);
#else
    int name_len = strlen(name);
    int val_len = strlen(value);
    char* envs = glob_env;
    if (envs == NULL) {
        return;
    }
    strcpy(envs, name);
    strcpy(envs + name_len, "=");
    strcpy(envs + name_len + 1, value);
    putenv(envs);
#endif
}

#ifdef LINUX
extern int Parrot_signbit(double x){
    ...
#endif

Compile-time options such as PARROT_HAS_SETENV and LINUX select different code paths, and thereby determine qualities such as Speed and Energy.
How do we understand the performance behavior of real-world, highly configurable systems that scale well…
… and enable developers/users to reason about qualities (performance, energy) and to make tradeoffs?
Outline
Case
Study
Transfer
Learning
Theory
Building
Guided
Sampling
Future
Directions
[SEAMS’17]
[ASE’17]
SocialSensor
• Identifying trending topics
• Identifying user-defined topics
• Social media search
SocialSensor
[Pipeline: Internet → Crawling (tweets: 5k-20k/min) → store Crawled items; the Orchestrator fetches every 10 min ([100k tweets]) and pushes to Content Analysis, which stores tweets ([10M]) for Search and Integration to fetch]
Challenges
[Same pipeline, annotated with the scaling targets: 100X, 10X, and real-time processing]
How can we achieve better performance without using more resources?
Let’s try out different system configurations!
Opportunity: Data processing engines in the pipeline were all configurable
> 100 options per engine; ~2,300 parameters in total
Default configuration was bad, and so was the expert’s
[Scatter: throughput (ops/sec, 0-1500) vs. average write latency (s, 0-5000), marking the Default, the configuration Recommended by an expert, and the Optimal Configuration; better = higher throughput, lower latency]
Default configuration was bad, and so was the expert’s
[Scatter: throughput (ops/sec, up to 2.5×10^4) vs. latency (ms, 0-300), again marking the Default, the expert’s recommendation, and the Optimal Configuration; better = higher throughput, lower latency]
Why is this an important problem?
Significant time savings
• 2X-10X faster than the worst configuration
• Noticeably faster than the median
• The default is bad
• The expert’s recommendation is not optimal
Large configuration space
• Exhaustive search is expensive
• Findings are specific to hardware/workload/version
What happened in the end?
• Achieved the objectives (100X users, same experience)
• Saved money by reducing cloud resources by up to 20%
• Our tool identified configurations that were consistently better than the expert’s recommendations
Outline
Case
Study
Transfer
Learning
Theory
Building
Guided
Sampling
Future
Directions
To enable performance tradeoffs, we need a model to reason about qualities

void Parrot_setenv(... name, ... value){
#ifdef PARROT_HAS_SETENV
    my_setenv(name, value, 1);
#else
    int name_len = strlen(name);
    int val_len = strlen(value);
    char* envs = glob_env;
    if (envs == NULL) {
        return;
    }
    strcpy(envs, name);
    strcpy(envs + name_len, "=");
    strcpy(envs + name_len + 1, value);
    putenv(envs);
#endif
}

#ifdef LINUX
...
#endif

Execution time (s): f(·) = 5 + 3 × o1
f(o1 := 0) = 5
f(o1 := 1) = 8
What is a performance model?
f : C → ℝ
c = ⟨o1, o2⟩
f(o1, o2) = 5 + 3·o1 + 15·o2 − 7·o1·o2
c = ⟨o1, o2, ..., o10⟩
c = ⟨o1, o2, ..., o100⟩
···
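The toy model above can be written out as executable code. A minimal sketch, using the slide’s toy coefficients (5, 3, 15, 7); `perf_model` is a hypothetical stand-in, not a real measurement:

```python
def perf_model(o1: int, o2: int) -> float:
    """Toy performance model over two binary options: f : C -> R."""
    return 5 + 3 * o1 + 15 * o2 - 7 * o1 * o2

# Enumerating the whole configuration space is only feasible for a few
# options; with 100 binary options there are 2^100 configurations.
space = {(o1, o2): perf_model(o1, o2) for o1 in (0, 1) for o2 in (0, 1)}
```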
How do we learn performance models?
Measure → Learn
TurtleBot
Configurations
f(o1, o2) = 5 + 3·o1 + 15·o2 − 7·o1·o2
The learned model supports optimization, reasoning, and debugging.
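The measure-then-learn loop can be sketched as fitting a regression with an interaction term to a handful of measured configurations. Here `measure` is a hypothetical stand-in for actually running the system in a given configuration:

```python
import numpy as np

def measure(o1, o2):
    # Stand-in for running the system in one configuration and timing it.
    return 5 + 3 * o1 + 15 * o2 - 7 * o1 * o2

# Measure a set of configurations...
configs = [(0, 0), (1, 0), (0, 1), (1, 1)]
X = np.array([[1, o1, o2, o1 * o2] for o1, o2 in configs], dtype=float)
y = np.array([measure(o1, o2) for o1, o2 in configs])

# ...then learn the coefficients by least squares.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef recovers [5, 3, 15, -7]; the learned model can now predict
# unmeasured configurations for optimization, reasoning, and debugging.
```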
Insight: Performance measurements of the real system are “similar” to the ones from the simulators
Measure
Simulator (Gazebo)
Data
Configurations
Performance
So why not reuse these data, instead of measuring on the real robot?
We developed methods to make learning cheaper
via transfer learning
Source (Given): learn a Model from Data, extract Transferable Knowledge
Target (Learn): reuse that knowledge when learning the target model

[Excerpt, SEAMS’17 background: Understanding the performance behavior of configurable software systems enables (i) performance debugging, (ii) performance tuning, (iii) design-time evolution, and (iv) runtime adaptation. A configuration is a member of the configuration space C = Dom(F1) × ··· × Dom(Fd), with Dom(Fi) = {0, 1}; an environment instance e = [w, h, v], drawn from W × H × V, captures workload, hardware, and system version. A performance model is a black-box function f : F × E → ℝ learned from observations yi = f(xi) + εi, εi ∼ N(0, σi), giving training data Dtr = {(xi, yi)}, i = 1..n. A performance distribution pd : E → Δ(ℝ) defines a probability distribution over performance measures for each environmental condition.]
Goal: Gain strength by
transferring information
across environments
TurtleBot
[P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17]
Our transfer learning solution
Measure on the TurtleBot and on the simulator (Gazebo) → Data
Reuse the simulator data → Learn
Configurations
f(o1, o2) = 5 + 3·o1 + 15·o2 − 7·o1·o2
[P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17]
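One simple way to realize the reuse step is model-shift transfer: measure many configurations on the cheap source, only a few on the target, and learn the source-to-target relationship. This is only a sketch (the SEAMS’17 solution uses Gaussian processes), and `simulator`/`robot` are hypothetical toy responses:

```python
import numpy as np

def simulator(c):                    # cheap source measurements (toy)
    return 10 + 4 * c

def robot(c):                        # expensive target measurements (toy)
    return 2 * simulator(c) + 3      # a linear shift of the source

configs = np.linspace(0, 1, 50)
src = simulator(configs)             # abundant source data

few = configs[[0, 25, 49]]           # only three robot measurements
A = np.stack([simulator(few), np.ones(len(few))], axis=1)
a, b = np.linalg.lstsq(A, robot(few), rcond=None)[0]

pred = a * src + b                   # transferred predictions for all configs
# With a strongly correlated source, three target samples suffice here.
```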
Gaussian processes for performance modeling
[Illustration: output f(x) vs. input x at t = n and t = n + 1; legend: Observation, Mean, Uncertainty, New observation]
Gaussian processes enable reasoning about performance

Step 1: Fit a GP to the data seen so far
Step 2: Explore the model for regions of most variance
Step 3: Sample that region
Step 4: Repeat

[Illustration: Configuration Space → Empirical Model → Experiment; a Selection Criterion drives the Sequential Design]
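The four steps above can be sketched with a small hand-rolled RBF-kernel GP; a hypothetical 1-D response surface stands in for a real configuration space:

```python
import numpy as np

def rbf(a, b, ell=0.3):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Standard GP regression: posterior mean and std at candidate points Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    mean = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum("ij,ij->j", Ks, np.linalg.solve(K, Ks))
    return mean, np.sqrt(np.clip(var, 0.0, None))

f = lambda x: np.sin(3 * x)              # hypothetical response surface
grid = np.linspace(0, 1, 101)            # candidate configurations
X = np.array([0.0, 1.0])                 # two initial experiments
y = f(X)

for _ in range(5):
    mean, std = gp_posterior(X, y, grid)  # Step 1: fit GP to data so far
    xn = grid[np.argmax(std)]             # Step 2: region of most variance
    X = np.append(X, xn)                  # Step 3: sample that region
    y = np.append(y, f(xn))               # Step 4: repeat

mean, std = gp_posterior(X, y, grid)      # uncertainty has shrunk everywhere
```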
CoBot experiment: DARPA BRASS

[Plot: localization error [m] (0-8) vs. CPU utilization [%] (10-40), showing the Pareto front, an energy constraint, a safety constraint, and the resulting sweet spot; better = lower error, lower CPU; configuration options: no_of_particles=x, no_of_refinement=y]
CoBot experiment

[Heat maps of CPU [%] over two configuration dimensions (5-25 × 5-25): Source (given), Target (ground truth, 6 months later), prediction with 4 samples, and prediction with transfer learning]
Results: Other configurable systems
CoBot WordCount SOL
RollingSort Cassandra (HW) Cassandra (DB)
[Paper: “Transfer Learning for Improving Model Predictions in Highly Configurable Software”, by Pooyan Jamshidi, Miguel Velez, Christian Kästner (Carnegie Mellon University), Norbert Siegmund (Bauhaus-University Weimar), and Prasad Kawthekar (Stanford University). Fig. 1: Transfer learning for performance model learning: measure on the simulator (source) and the robot (target), learn a model with transfer learning, and use the predictive model for adaptation.]
Details: [SEAMS ’17]
Summary (transfer learning)
• Model for making tradeoffs between qualities
• Scales to large configuration spaces and environmental changes
• Transfer learning can help
• Increase prediction accuracy
• Increase model reliability
• Decrease model-building cost
Outline
Case
Study
Transfer
Learning
Theory
Building
Guided
Sampling
Future
Directions
Looking further: When transfer learning goes wrong

[Box plots: Absolute Percentage Error [%] (10-60) for the non-transfer-learning baseline and for transfer from sources s, s1-s6]

Source        s      s1     s2     s3     s4     s5     s6
noise level   0      5      10     15     20     25     30
corr. coeff.  0.98   0.95   0.89   0.75   0.54   0.34   0.19
μ(pe)         15.34  14.14  17.09  18.71  33.06  40.93  46.75

It worked! … It didn’t!

Insight: Predictions become more accurate when the source is more related to the target.
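This insight suggests a cheap pre-check before transferring: measure a few configurations in both environments and compute the correlation coefficient (the “corr. coeff.” row of the table). A sketch on synthetic data; all measurements here are made up:

```python
import numpy as np

rng = np.random.default_rng(1)
source = rng.uniform(10, 60, size=30)           # source measurements (toy)

related = 2.0 * source + rng.normal(0, 1, 30)   # strongly related target
unrelated = rng.uniform(10, 60, size=30)        # unrelated target

r_good = np.corrcoef(source, related)[0, 1]
r_bad = np.corrcoef(source, unrelated)[0, 1]
# Transfer when correlation is high; learn from scratch when it is low.
```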
Our empirical study: We looked at different highly-
configurable systems to gain insights
[P. Jamshidi, et al., “Transfer learning for performance modeling of configurable systems….”, ASE’17]
SPEAR (SAT solver): analysis time; 14 options, 16,384 configurations; SAT problems; 3 hardware platforms; 2 versions
x264 (video encoder): encoding time; 16 options, 4,000 configurations; video quality/size; 2 hardware platforms; 3 versions
SQLite (DB engine): query time; 14 options, 1,000 configurations; DB queries; 2 hardware platforms; 2 versions
SaC (compiler): execution time; 50 options, 71,267 configurations; 10 demo programs
Observation 1: Linear shift happens only in limited environmental changes

Software  Environmental change         Severity  Corr.
SPEAR     NUC/2 -> NUC/4               Small     1.00
SPEAR     Amazon_nano -> NUC           Large     0.59
SPEAR     Hardware/workload/version    V. Large  -0.10
x264      Version                      Large     0.06
x264      Workload                     Medium    0.65
SQLite    write-seq -> write-batch     Small     0.96
SQLite    read-rand -> read-seq        Medium    0.50

[Scatter: source vs. target throughput]

Implication: Simple transfer learning is limited to hardware changes in practice.
Observation 2: Influential options and interactions
are preserved across environments

Software   Environmental change        Severity   Options   Dim   t-test
x264       Version                     Large      16        12    10
x264       Hardware/workload/version   V Large    16         8     9
SQLite     write-seq -> write-batch    V Large    14         3     4
SQLite     read-rand -> read-seq       Medium     14         1     1
SaC        Workload                    V Large    50        16    10

Implication: Avoid wasting budget on non-informative parts of the configuration space and focus where it matters.
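One way to exploit this observation is to screen options in the cheap source environment and spend the target budget only on the influential ones. A hedged sketch with synthetic data (ranking by absolute correlation is an assumption here, not the study's exact statistical test):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical source-environment measurements: 200 random configurations
# of 16 binary options, where throughput is driven by only a few options.
X = rng.integers(0, 2, size=(200, 16))
y = 30.0 * X[:, 0] - 12.0 * X[:, 3] + rng.normal(0, 1, 200)

# Rank options by |correlation| with the metric in the SOURCE environment.
scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(16)]
influential = sorted(range(16), key=lambda j: -scores[j])[:2]
print(sorted(influential))  # the two truly influential options
```

If influential options are preserved across environments, sampling the target only along these dimensions shrinks the search space dramatically.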
2^16 / 2^50 ≈ 0.000000000058

We only need to explore part of the space.
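The ratio above is just 2^16 / 2^50 = 2^-34; a one-liner confirms the rounding:

```python
# Fraction of a 50-option binary space covered by exploring
# only the subspace spanned by 16 influential options.
ratio = 2**16 / 2**50
print(f"{ratio:.2e}")  # 5.82e-11, i.e. ~0.000000000058
```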
Transfer Learning for Performance Modeling of
Configurable Systems: An Exploratory Analysis
Pooyan Jamshidi
Carnegie Mellon University, USA
Norbert Siegmund
Bauhaus-University Weimar, Germany
Miguel Velez, Christian Kästner
Akshay Patel, Yuvraj Agarwal
Carnegie Mellon University, USA
Abstract—Modern software systems provide many configuration options which significantly influence their non-functional properties. To understand and predict the effect of configuration options, several sampling and learning strategies have been proposed, albeit often with significant cost to cover the highly dimensional configuration space. Recently, transfer learning has been applied to reduce the effort of constructing performance models by transferring knowledge about performance behavior across environments. While this line of research is promising to learn more accurate models at a lower cost, it is unclear why and when transfer learning works for performance modeling. To shed light on when it is beneficial to apply transfer learning, we conducted an empirical study on four popular software systems, varying software configurations and environmental conditions, such as hardware, workload, and software versions, to identify the key knowledge pieces that can be exploited for transfer learning. Our results show that in small environmental changes (e.g., homogeneous workload change), by applying a linear transformation to the performance model, we can understand the performance behavior of the target environment, while for severe environmental changes (e.g., drastic workload change) we can transfer only knowledge that makes sampling more efficient, e.g., by reducing the dimensionality of the configuration space.
Index Terms—Performance analysis, transfer learning.
Fig. 1: Transfer learning is a form of machine learning that takes advantage of transferable knowledge from source to learn an accurate, reliable, and less costly model for the target environment.
Details: [ASE ’17]
Outline
• Case Study
• Transfer Learning
• Empirical Study
• Guided Sampling
• Vision
What will the software systems
of the future look like?
Software 2.0
Increasingly customized and configurable
VISION
Increasingly competing objectives:
• Accuracy
• Training speed
• Inference speed
• Model size
• Energy

Deep neural network as a highly configurable system
[Figure residue: a deep network rendered as a highly configurable system; dozens of candidate cells, each described by width N, active ops, groups, kernel sizes (ks), and dilations (d), wired together by '+' junctions toward the output.]
Exploring the design space of deep networks

[Figure residue: image-classification models constructed from cells optimized with architecture search (small CIFAR-10 model, large CIFAR-10 model, ImageNet model; stacks of cells and sep.conv3x3 / conv3x3 layers ending in global pool and linear & softmax). The search is carried out entirely on the CIFAR-10 training set, split into 40K training and 10K validation images; candidate models are trained on the training subset and evaluated on the validation subset to obtain the fitness. The selected cell is then plugged into a large model trained on training plus validation, with accuracy reported on the CIFAR-10 test set, which is never used for model selection; the learned cells are also evaluated at scale on the ImageNet challenge dataset. A similar approach has recently been used in (Zoph et al., 2017; Zhong et al., 2017).]

Optimal Architecture (Yesterday)
New Fraud Pattern
Optimal Architecture (Today)
Deep architecture deployment
To achieve optimal results, it is important to explore:
• Network architectures
• Software libraries (TensorFlow, CNTK, Theano, etc.)
• Deployment infrastructure
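Even a toy enumeration shows how quickly this combined design space grows; the choices below are hypothetical placeholders, not the deck's actual experiment grid:

```python
import itertools

# Hypothetical design dimensions for deploying a deep network.
architectures = ["small-cnn", "large-cnn", "resnet-cell"]
libraries = ["TensorFlow", "CNTK", "Theano"]
infrastructure = ["cpu-server", "gpu-server", "edge-device"]

# Full cross product: every (architecture, library, infrastructure) combo.
design_space = list(itertools.product(architectures, libraries, infrastructure))
print(len(design_space))  # 3 * 3 * 3 = 27 combinations to evaluate
```

With realistic numbers of options per dimension, exhaustive evaluation becomes infeasible, which motivates the guided sampling and transfer learning above.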
Exploring the design space of deep networks

[Figure residue: scatter of candidate networks, validation error vs. inference time [h], with "better" toward the origin on both axes; the default configuration is marked, along with the Pareto-optimal configurations.]
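The Pareto-optimal front highlighted in that plot can be computed with a few lines. A sketch over hypothetical (inference time, validation error) pairs, where smaller is better on both axes:

```python
# Hypothetical (inference_time_h, validation_error) measurements.
points = [(0.3, 0.45), (0.5, 0.20), (0.4, 0.30), (0.9, 0.18),
          (0.6, 0.25), (0.2, 0.70), (1.1, 0.17)]

def pareto_front(pts):
    """Keep points not dominated by any other point (both objectives minimized)."""
    front = []
    for p in pts:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in pts)
        if not dominated:
            front.append(p)
    return sorted(front)

print(pareto_front(points))
```

Here (0.6, 0.25) is dominated by (0.5, 0.20) and drops out; the remaining points form the tradeoff curve an architect chooses from.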
ML pipeline
[Jeff Smith’s book on Reactive ML]
Architectural tradeoffs for production-ready ML
systems (Uber)
More info?
https://tinyurl.com/y97u68cr
Configuration errors are prevalent
Configuration complexity and dependencies between
options are a major source of configuration errors

[Diagram residue: default configuration and operational contexts I–III; configuration options associated to components of the Apache Hadoop architecture.]
Configurations are software too

Configuration errors are common
[Pie chart residue: 69% configuration errors vs. 31% other errors; source: cobalt.io]
We can find the repair patches faster with a lighter workload
• [localization]: Using transfer learning to derive the most likely configurations that manifest the bugs
• [repair]: Automatically prepare patches to fix the configuration bug
Thanks
Architectural Tradeoff in Learning-Based Software
Configuration Optimization for Big Data Software
Pooyan Jamshidi
 
Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit...
Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit...Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit...
Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit...
Pooyan Jamshidi
 

More from Pooyan Jamshidi (20)

Learning LWF Chain Graphs: A Markov Blanket Discovery Approach
Learning LWF Chain Graphs: A Markov Blanket Discovery ApproachLearning LWF Chain Graphs: A Markov Blanket Discovery Approach
Learning LWF Chain Graphs: A Markov Blanket Discovery Approach
 
A Framework for Robust Control of Uncertainty in Self-Adaptive Software Conn...
 A Framework for Robust Control of Uncertainty in Self-Adaptive Software Conn... A Framework for Robust Control of Uncertainty in Self-Adaptive Software Conn...
A Framework for Robust Control of Uncertainty in Self-Adaptive Software Conn...
 
Machine Learning Meets Quantitative Planning: Enabling Self-Adaptation in Aut...
Machine Learning Meets Quantitative Planning: Enabling Self-Adaptation in Aut...Machine Learning Meets Quantitative Planning: Enabling Self-Adaptation in Aut...
Machine Learning Meets Quantitative Planning: Enabling Self-Adaptation in Aut...
 
Ensembles of Many Diverse Weak Defenses can be Strong: Defending Deep Neural ...
Ensembles of Many Diverse Weak Defenses can be Strong: Defending Deep Neural ...Ensembles of Many Diverse Weak Defenses can be Strong: Defending Deep Neural ...
Ensembles of Many Diverse Weak Defenses can be Strong: Defending Deep Neural ...
 
Transfer Learning for Performance Analysis of Configurable Systems: A Causal ...
Transfer Learning for Performance Analysis of Configurable Systems:A Causal ...Transfer Learning for Performance Analysis of Configurable Systems:A Causal ...
Transfer Learning for Performance Analysis of Configurable Systems: A Causal ...
 
Learning to Sample
Learning to SampleLearning to Sample
Learning to Sample
 
Integrated Model Discovery and Self-Adaptation of Robots
Integrated Model Discovery and Self-Adaptation of RobotsIntegrated Model Discovery and Self-Adaptation of Robots
Integrated Model Discovery and Self-Adaptation of Robots
 
Production-Ready Machine Learning for the Software Architect
Production-Ready Machine Learning for the Software ArchitectProduction-Ready Machine Learning for the Software Architect
Production-Ready Machine Learning for the Software Architect
 
Transfer Learning for Software Performance Analysis: An Exploratory Analysis
Transfer Learning for Software Performance Analysis: An Exploratory AnalysisTransfer Learning for Software Performance Analysis: An Exploratory Analysis
Transfer Learning for Software Performance Analysis: An Exploratory Analysis
 
Architecting for Scale
Architecting for ScaleArchitecting for Scale
Architecting for Scale
 
Learning Software Performance Models for Dynamic and Uncertain Environments
Learning Software Performance Models for Dynamic and Uncertain EnvironmentsLearning Software Performance Models for Dynamic and Uncertain Environments
Learning Software Performance Models for Dynamic and Uncertain Environments
 
Sensitivity Analysis for Building Adaptive Robotic Software
Sensitivity Analysis for Building Adaptive Robotic SoftwareSensitivity Analysis for Building Adaptive Robotic Software
Sensitivity Analysis for Building Adaptive Robotic Software
 
Transfer Learning for Improving Model Predictions in Highly Configurable Soft...
Transfer Learning for Improving Model Predictions in Highly Configurable Soft...Transfer Learning for Improving Model Predictions in Highly Configurable Soft...
Transfer Learning for Improving Model Predictions in Highly Configurable Soft...
 
Transfer Learning for Improving Model Predictions in Robotic Systems
Transfer Learning for Improving Model Predictions  in Robotic SystemsTransfer Learning for Improving Model Predictions  in Robotic Systems
Transfer Learning for Improving Model Predictions in Robotic Systems
 
Machine Learning meets DevOps
Machine Learning meets DevOpsMachine Learning meets DevOps
Machine Learning meets DevOps
 
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S...
 
Configuration Optimization Tool
Configuration Optimization ToolConfiguration Optimization Tool
Configuration Optimization Tool
 
Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Ar...
Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Ar...Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Ar...
Fuzzy Self-Learning Controllers for Elasticity Management in Dynamic Cloud Ar...
 
Configuration Optimization for Big Data Software
Configuration Optimization for Big Data SoftwareConfiguration Optimization for Big Data Software
Configuration Optimization for Big Data Software
 
Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit...
Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit...Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit...
Microservices Architecture Enables DevOps: Migration to a Cloud-Native Archit...
 

Recently uploaded

fiscal year variant fiscal year variant.
fiscal year variant fiscal year variant.fiscal year variant fiscal year variant.
fiscal year variant fiscal year variant.
AnkitaPandya11
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
Green Software Development
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
Peter Muessig
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
XfilesPro
 
Liberarsi dai framework con i Web Component.pptx
Liberarsi dai framework con i Web Component.pptxLiberarsi dai framework con i Web Component.pptx
Liberarsi dai framework con i Web Component.pptx
Massimo Artizzu
 
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
mz5nrf0n
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
Quickdice ERP
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
ShulagnaSarkar2
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
kalichargn70th171
 
Oracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptxOracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptx
Remote DBA Services
 
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
safelyiotech
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
Remote DBA Services
 
Preparing Non - Technical Founders for Engaging a Tech Agency
Preparing Non - Technical Founders for Engaging  a  Tech AgencyPreparing Non - Technical Founders for Engaging  a  Tech Agency
Preparing Non - Technical Founders for Engaging a Tech Agency
ISH Technologies
 
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdfTop Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
VALiNTRY360
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
gapen1
 
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
dakas1
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
Peter Muessig
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
dakas1
 
WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
Patrick Weigel
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
Grant Fritchey
 

Recently uploaded (20)

fiscal year variant fiscal year variant.
fiscal year variant fiscal year variant.fiscal year variant fiscal year variant.
fiscal year variant fiscal year variant.
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
 
Liberarsi dai framework con i Web Component.pptx
Liberarsi dai framework con i Web Component.pptxLiberarsi dai framework con i Web Component.pptx
Liberarsi dai framework con i Web Component.pptx
 
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
 
Oracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptxOracle 23c New Features For DBAs and Developers.pptx
Oracle 23c New Features For DBAs and Developers.pptx
 
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
 
Preparing Non - Technical Founders for Engaging a Tech Agency
Preparing Non - Technical Founders for Engaging  a  Tech AgencyPreparing Non - Technical Founders for Engaging  a  Tech Agency
Preparing Non - Technical Founders for Engaging a Tech Agency
 
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdfTop Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
 
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
 
WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
 

Architectural Tradeoff in Learning-Based Software

  • 1. Architectural Tradeoff in Learning-Based Software Pooyan Jamshidi Carnegie Mellon University cs.cmu.edu/~pjamshid SATURN 2018 Texas, May 8th, 2018
  • 5. I hang around here! Software Arch. Machine Learning Distributed Systems
  • 9. Goal: Enable developers/users to find the right quality tradeoff
  • 10. Today’s most popular (ML) systems are built configurable
  • 19. Empirical observations confirm that systems are becoming increasingly configurable [Plots: number of parameters vs. release time for Apache httpd (1.3.14–2.3.4) and Hadoop MapReduce/HDFS (0.1.0–2.0.0), all growing steadily] [Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]
  • 20. Empirical observations confirm that systems are becoming increasingly configurable [Excerpt of the FSE’15 paper (UC San Diego, Huazhong Univ. of Science & Technology, NetApp): the study examines configuration settings from thousands of customers of a commercial storage system (Storage-A) and hundreds of open-source system software projects; its guidelines could remove parameters and simplify 19.7% of them without affecting existing users. Plots: number of parameters vs. release time for Storage-A (growing toward 700), MySQL (3.23.0–5.6.2), Apache (1.3.14–2.3.4), and Hadoop MapReduce/HDFS (0.1.0–2.0.0)] [Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]
  • 21. “all constants should be configurable, even if we can’t see any reason to configure them.” — HDFS-4304
  • 22. Configurations determine the performance behavior:

    void Parrot_setenv(... name, ... value) {
    #ifdef PARROT_HAS_SETENV
        my_setenv(name, value, 1);
    #else
        int name_len = strlen(name);
        int val_len = strlen(value);
        char* envs = glob_env;
        if (envs == NULL) {
            return;
        }
        strcpy(envs, name);
        strcpy(envs + name_len, "=");
        strcpy(envs + name_len + 1, value);
        putenv(envs);
    #endif
    }

    Compile-time options such as PARROT_HAS_SETENV and LINUX (e.g., extern int Parrot_signbit(double x) under LINUX) select different code paths, which in turn determine qualities such as speed and energy.
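The point of this slide can be reproduced in miniature: two code paths that compute the same result but perform differently, selected by a configuration option. The function and option names below are hypothetical illustrations, not taken from the Parrot code on the slide:

```python
import timeit

def build_env_string(pairs, use_join):
    # Two code paths selected by a configuration option, analogous to the
    # #ifdef branches in Parrot_setenv: same functional result,
    # different performance behavior.
    if use_join:
        return ";".join(f"{k}={v}" for k, v in pairs)
    out = ""
    for k, v in pairs:
        out += k + "=" + v + ";"  # repeated string concatenation
    return out.rstrip(";")

pairs = [(f"k{i}", f"v{i}") for i in range(1000)]
t_join = timeit.timeit(lambda: build_env_string(pairs, True), number=100)
t_concat = timeit.timeit(lambda: build_env_string(pairs, False), number=100)
print(f"use_join=True: {t_join:.3f}s   use_join=False: {t_concat:.3f}s")
```

Both settings produce identical output; only the measured time differs, which is exactly why performance behavior must be measured per configuration.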
  • 30. How do we understand the performance behavior of real-world, highly configurable systems that scale well… and how do we enable developers/users to reason about qualities (performance, energy) and make tradeoffs?
  • 36. SocialSensor •Identifying trending topics •Identifying user defined topics •Social media search
  • 43. SocialSensor [Architecture: Crawling pulls tweets from the Internet (5k–20k/min) and stores crawled items; the Orchestrator fetches every 10 min (~100k tweets) and pushes them to Content Analysis, which stores its results (~10M tweets); Search and Integration fetches from that store]
  • 47. Challenges [Same architecture, annotated with scaling demands: 100X, 10X, and real-time requirements]
  • 48. How can we gain better performance without using more resources?
  • 49. Let’s try out different system configurations!
  • 55. Opportunity: the data processing engines in the pipeline were all configurable (> 100 parameters each, 2300 in total)
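To see why “let’s try out different configurations” cannot mean trying all of them, a back-of-the-envelope sketch helps. The per-engine counts below are illustrative (the slide only says each engine has more than 100 parameters), and binary options are assumed for simplicity:

```python
# Size of a joint configuration space: one candidate per combination of
# options. With three engines, each exposing ~100 binary options, the
# space is far beyond anything exhaustive search can cover.
options_per_engine = [100, 100, 100]  # illustrative counts

total_options = sum(options_per_engine)
search_space = 2 ** total_options  # binary options -> 2^n configurations

print(f"{total_options} options, {search_space:.3e} configurations")

# Even measuring 1,000 configurations per second, enumeration would take
# about 2^300 / 1000 seconds -- astronomically longer than the age of
# the universe.
seconds = search_space / 1000
print(f"~{seconds:.3e} seconds to enumerate")
```

Real parameters are often non-binary, which only makes the space larger; this is why the talk turns to learned performance models instead of brute force.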
  • 60. The default configuration was bad, and so was the expert’s [Scatter plot: average write latency (s), 0–5000, vs. throughput (ops/sec), 0–1500; the default and the expert-recommended configurations both sit far from the optimal configuration; higher throughput and lower latency are better]
  • 64. The default configuration was bad, and so was the expert’s [A second plot: latency (ms), 0–300, vs. throughput (ops/sec), up to 2.5 × 10^4; again the default and expert-recommended configurations are far from the optimal configuration]
  • 74. Why is this an important problem? Significant time savings: the optimal configuration is 2X–10X faster than the worst, noticeably faster than the median, the default is bad, and the expert’s recommendation is not optimal. Large configuration space: exhaustive search is expensive, and results are specific to hardware, workload, and version.
  • 75. What happened in the end? • Achieved the objectives (100X users, same experience) • Saved money by reducing cloud resources by up to 20% • Our tool identified configurations that were consistently better than the expert’s recommendation
  • 79. To enable performance tradeoffs, we need a model to reason about qualities [The Parrot_setenv code from earlier, annotated with a performance model for execution time (s): f(·) = 5 + 3 × o1, so f(o1 := 0) = 5 and f(o1 := 1) = 8]
  • 84. What is a performance model? A function f : C → R over the configuration space, e.g., f(o1, o2) = 5 + 3·o1 + 15·o2 − 7·o1·o2 for configurations c = <o1, o2>; in practice configurations are much larger: c = <o1, o2, …, o10>, c = <o1, o2, …, o100>, …
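The slide’s toy model can be written down directly and, with only two binary options, evaluated exhaustively; reading the slide’s interaction term as −7·o1·o2 (the minus sign is lost in the transcript, so this is an assumed reconstruction):

```python
from itertools import product

def f(o1, o2):
    # Performance model from the slide: f(o1, o2) = 5 + 3*o1 + 15*o2 - 7*o1*o2
    return 5 + 3 * o1 + 15 * o2 - 7 * o1 * o2

# With two binary options the space has only 2^2 = 4 configurations,
# so we can enumerate it and read off the optimum directly.
for c in product([0, 1], repeat=2):
    print(c, "->", f(*c))

best = min(product([0, 1], repeat=2), key=lambda c: f(*c))
print("best configuration:", best, "with predicted time", f(*best))
```

Note the interaction term: enabling both options is cheaper than their individual costs suggest (16 rather than 23), which is exactly why performance models cannot treat options independently.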
  • 88. How do we learn performance models? Measure configurations on the system (here, a TurtleBot), learn a model such as f(o1, o2) = 5 + 3·o1 + 15·o2 − 7·o1·o2, and then use it for optimization, reasoning, and debugging.
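The Measure → Learn step can be sketched as an ordinary least-squares fit over measured configurations, with the interaction o1·o2 added as a feature. This is a minimal illustration of the idea, not the learners used in the talk; the observations are generated from the slide’s model so the fit recovers its coefficients exactly:

```python
import numpy as np

# Measured configurations (o1, o2) and their observed performance,
# generated here from the slide's model f = 5 + 3*o1 + 15*o2 - 7*o1*o2.
X_raw = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([5.0, 20.0, 8.0, 16.0])

# Feature matrix: intercept, o1, o2, and the interaction o1*o2.
X = np.column_stack([np.ones(len(X_raw)), X_raw[:, 0], X_raw[:, 1],
                     X_raw[:, 0] * X_raw[:, 1]])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("learned coefficients:", coef)  # approximately [5, 3, 15, -7]

# The learned model can now be queried instead of the real system.
def predict(o1, o2):
    return coef @ np.array([1.0, o1, o2, o1 * o2])

print("predicted f(1, 1) =", predict(1, 1))
```

On real systems the measurements are noisy and the feature space is far larger, which is where sampling strategies and transfer learning come in.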
  • 91. Insight: performance measurements of the real system are “similar” to the ones from the simulators [Measure configurations → performance on the real robot and on the simulator (Gazebo)]. So why not reuse these data, instead of measuring on the real robot?
  • 92. We developed methods to make learning cheaper via transfer learning. Source (Given) -> Target (Learn): Data, Model, Transferable Knowledge. [The slide excerpts Section II (Intuition) of the paper: understanding the performance behavior of configurable software enables (i) performance debugging, (ii) performance tuning, (iii) design-time evolution, and (iv) runtime adaptation; for instance, we can learn performance behavior on cheap hardware in a controlled lab environment and use it to understand behavior on a production server before shipping to the end user. Formally, the configuration space is the Cartesian product of all features, C = Dom(F1) × · · · × Dom(Fd) with Dom(Fi) = {0, 1}; an environment instance e = [w, h, v] is drawn from an environment space E = W × H × V of workloads, hardware, and system versions; a performance model is a black-box function f : C × E -> R learned from observed performance values yi = f(xi) + εi with εi ∼ N(0, σi), giving training data Dtr = {(xi, yi)}, i = 1..n; and an empirical performance distribution is a stochastic process pd : E -> Δ(R) that assigns a probability distribution over performance measures to each environmental condition.] Extract -> Reuse -> Learn. Goal: Gain strength by transferring information across environments
  • 93. TurtleBot [P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17] Our transfer learning solution
  • 94. TurtleBot Simulator (Gazebo) [P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17] Our transfer learning solution
  • 95. Data Measure TurtleBot Simulator (Gazebo) [P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17] Our transfer learning solution
  • 96. DataData Data Measure Measure Reuse TurtleBot Simulator (Gazebo) [P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17] Configurations Our transfer learning solution
  • 97. Data, Data, Data; Measure, Measure, Reuse, Learn. TurtleBot, Simulator (Gazebo) [P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17] Configurations. Our transfer learning solution: f(o1, o2) = 5 + 3o1 + 15o2 − 7o1 × o2
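A model of the form shown on the slide (reading the dropped sign as a minus: f(o1, o2) = 5 + 3o1 + 15o2 − 7o1 × o2) can be recovered from the four configuration measurements by least squares with an interaction column; the response values below are the noise-free outputs of that model:

```python
import numpy as np

# Measured performance of the four configurations, consistent with the model
# on the slide: f(o1, o2) = 5 + 3*o1 + 15*o2 - 7*o1*o2.
X = np.array([(0, 0), (1, 0), (0, 1), (1, 1)], dtype=float)
y = np.array([5.0, 8.0, 20.0, 16.0])

# Design matrix with an intercept and an o1*o2 interaction term.
A = np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coef, 2))  # the coefficients [5, 3, 15, -7] are recovered
```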
  • 98. Gaussian processes for performance modeling t = n output,f(x) input, x
  • 99. Gaussian processes for performance modeling t = n Observation output,f(x) input, x
  • 100. Gaussian processes for performance modeling t = n Observation Mean output,f(x) input, x
  • 101. Gaussian processes for performance modeling t = n Observation Mean Uncertainty output,f(x) input, x
  • 102. Gaussian processes for performance modeling t = n Observation Mean Uncertainty New observation output,f(x) input, x
  • 103. Gaussian processes for performance modeling: t = n -> t = n + 1. Observation, Mean, Uncertainty, New observation; output f(x), input x
  • 104. Gaussian Processes enable reasoning about performance. Step 1: Fit a GP to the data seen so far. Step 2: Explore the model for regions of most variance. Step 3: Sample that region. Step 4: Repeat. Configuration Space -> Empirical Model; Experiment <-> Selection Criteria (Sequential Design)
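The four-step loop above can be sketched with a minimal Gaussian process built directly from an RBF kernel (numpy only, no GP library). The response surface f and the kernel length-scale are made up for illustration:

```python
import numpy as np

def f(x):  # hypothetical response surface we can only sample point-wise
    return np.sin(3 * x) + 0.5 * x

def rbf(a, b, ell=0.3):
    # Squared-exponential kernel between two 1-D point sets.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    # Standard GP regression equations: posterior mean and variance on Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    K_inv = np.linalg.inv(K)
    mu = Ks.T @ K_inv @ y
    var = np.diag(rbf(Xs, Xs) - Ks.T @ K_inv @ Ks)
    return mu, var

grid = np.linspace(0, 2, 101)
X = np.array([0.1, 1.9])            # initial observations
y = f(X)

for _ in range(5):
    mu, var = gp_posterior(X, y, grid)   # Step 1: fit GP to data seen so far
    x_next = grid[np.argmax(var)]        # Steps 2-3: sample where variance is largest
    X = np.append(X, x_next)             # Step 4: repeat
    y = np.append(y, f(x_next))

print(np.round(X, 2))
```

Each iteration shrinks the model's uncertainty where it was largest, which is the "sequential design" idea on the slide.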
  • 107. CoBot experiment: DARPA BRASS. Localization error [m] vs. CPU utilization [%] (lower is better on both axes); Pareto front; no_of_particles=x, no_of_refinement=y
  • 108. CoBot experiment: DARPA BRASS. Localization error [m] vs. CPU utilization [%]; Energy constraint; Pareto front; no_of_particles=x, no_of_refinement=y
  • 109. CoBot experiment: DARPA BRASS. Localization error [m] vs. CPU utilization [%]; Energy constraint; Safety constraint; Pareto front; no_of_particles=x, no_of_refinement=y
  • 110. CoBot experiment: DARPA BRASS. Localization error [m] vs. CPU utilization [%]; Energy constraint; Safety constraint; Pareto front; Sweet Spot; no_of_particles=x, no_of_refinement=y
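The Pareto front in these plots (minimize both localization error and CPU utilization) can be computed by filtering out dominated configurations. The measurement points below are hypothetical stand-ins for real CoBot data:

```python
import numpy as np

# Hypothetical (localization error [m], CPU utilization [%]) measurements,
# one point per configuration; lower is better for both objectives.
points = np.array([
    [0.5, 38.0], [1.0, 30.0], [2.0, 22.0], [4.0, 15.0], [7.0, 12.0],
    [3.0, 35.0], [5.0, 28.0], [6.0, 20.0],
])

def pareto_front(pts):
    """Indices of points not dominated by any other point (minimization)."""
    front = []
    for i, p in enumerate(pts):
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(pts) if j != i
        )
        if not dominated:
            front.append(i)
    return front

front = pareto_front(points)
print(points[front])
```

Constraints (energy, safety) would simply be applied as filters on `points` before computing the front; the "sweet spot" is then chosen among the remaining front points.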
  • 112. CoBot experiment. Source (given), CPU [%]
  • 113. CoBot experiment. Source (given); Target (ground truth after 6 months), CPU [%]
  • 114. CoBot experiment. Source (given); Target (ground truth after 6 months); Prediction with 4 samples, CPU [%]
  • 115. CoBot experiment. Source (given); Target (ground truth after 6 months); Prediction with 4 samples; Prediction with transfer learning, CPU [%]
  • 116. Results: Other configurable systems CoBot WordCount SOL RollingSort Cassandra (HW) Cassandra (DB)
  • 117. Transfer Learning for Improving Model Predictions in Highly Configurable Software. Pooyan Jamshidi, Miguel Velez, Christian Kästner (Carnegie Mellon University, USA); Norbert Siegmund (Bauhaus-University Weimar, Germany); Prasad Kawthekar (Stanford University, USA). Abstract—Modern software systems are built to be used in dynamic environments using configuration capabilities to adapt to changes and external uncertainties. In a self-adaptation context, we are often interested in reasoning about the performance of the systems under different configurations. Usually, we learn a black-box model based on real measurements to predict the performance of the system given a specific configuration. However, as modern systems become more complex, there are many configuration parameters that may interact and we end up learning an exponentially large configuration space. Naturally, this does not scale when relying on real measurements in the actual changing environment. We propose a different solution: instead of taking the measurements from the real system, we learn the model using samples from other sources, such as simulators that approximate performance of the real system at … Fig. 1: Transfer learning for performance model learning. Details: [SEAMS ’17]
  • 118. Summary (transfer learning)
  • 119. Summary (transfer learning) • Model for making tradeoffs between qualities
  • 120. Summary (transfer learning) • Model for making tradeoffs between qualities • Scale to large spaces and environmental changes
  • 121. Summary (transfer learning) • Model for making tradeoffs between qualities • Scale to large spaces and environmental changes • Transfer learning can help
  • 122. Summary (transfer learning) • Model for making tradeoffs between qualities • Scale to large spaces and environmental changes • Transfer learning can help • Increase prediction accuracy
  • 123. Summary (transfer learning) • Model for making tradeoffs between qualities • Scale to large spaces and environmental changes • Transfer learning can help • Increase prediction accuracy • Increase model reliability
  • 124. Summary (transfer learning) • Model for making tradeoffs between qualities • Scale to large spaces and environmental changes • Transfer learning can help • Increase prediction accuracy • Increase model reliability • Decrease model-building cost
  • 126. Looking further: When transfer learning goes wrong. Absolute Percentage Error [%] by source — s, s1, s2, s3, s4, s5, s6; noise-level: 0, 5, 10, 15, 20, 25, 30; corr. coeff.: 0.98, 0.95, 0.89, 0.75, 0.54, 0.34, 0.19; µ(pe): 15.34, 14.14, 17.09, 18.71, 33.06, 40.93, 46.75; non-transfer-learning baseline shown for comparison. Insight: Predictions become more accurate when the source is more related to the target.
  • 127. Looking further: When transfer learning goes wrong. Same table, annotated on the highly correlated sources: it worked! Insight: Predictions become more accurate when the source is more related to the target.
  • 128. Looking further: When transfer learning goes wrong. Same table, annotated on both ends: for highly correlated sources it worked; for weakly correlated sources it didn’t! Insight: Predictions become more accurate when the source is more related to the target.
  • 129. Our empirical study: We looked at different highly- configurable systems to gain insights [P. Jamshidi, et al., “Transfer learning for performance modeling of configurable systems….”, ASE’17] SPEAR (SAT Solver) Analysis time 14 options 16,384 configurations SAT problems 3 hardware 2 versions X264 (video encoder) Encoding time 16 options 4,000 configurations Video quality/size 2 hardware 3 versions SQLite (DB engine) Query time 14 options 1,000 configurations DB Queries 2 hardware 2 versions SaC (Compiler) Execution time 50 options 71,267 configurations 10 Demo programs
  • 130. Observation 1: Linear shift happens only in limited environmental changes. Target vs. Source throughput. Software / environmental change / severity / corr.: SPEAR — NUC/2 -> NUC/4, Small, 1.00; Amazon_nano -> NUC, Large, 0.59; hardware/workload/version, V Large, −0.10. x264 — Version, Large, 0.06; Workload, Medium, 0.65. SQLite — write-seq -> write-batch, Small, 0.96; read-rand -> read-seq, Medium, 0.50. Implication: Simple transfer learning is limited to hardware changes in practice.
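Where the correlation is near 1 (the “Small” changes above), the target behavior is approximately a linear shift of the source, so a handful of target measurements suffice to calibrate the transfer. A sketch with synthetic data standing in for real measurements:

```python
import numpy as np

rng = np.random.default_rng(2)

# Source measurements (e.g., throughput on cheap hardware) for 20 configurations.
source = np.linspace(10, 40, 20)
# Small environmental change: target responses are (roughly) a linear shift of
# the source, target = 1.5 * source + 4, plus measurement noise.
target = 1.5 * source + 4 + rng.normal(0, 0.5, size=20)

# High source/target correlation signals that a simple linear transfer is viable.
corr = np.corrcoef(source, target)[0, 1]

# Learn the shift from only four target measurements.
idx = [0, 6, 12, 19]
A = np.column_stack([source[idx], np.ones(len(idx))])
(alpha, beta), *_ = np.linalg.lstsq(A, target[idx], rcond=None)

pred = alpha * source + beta
ape = np.mean(np.abs(pred - target) / target) * 100  # mean abs. percentage error
print(round(corr, 3), round(alpha, 2), round(beta, 1), round(ape, 1))
```

For the “Large” changes in the table (correlation near 0), no such α and β exist, which is exactly why simple transfer fails there.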
  • 134. Observation 2: Influential options and interactions are preserved across environments. Software / environmental change / severity / dim. / t-test: x264 — Version, Large, 16, 12, 10; hardware/workload/version, V Large, 8, 9. SQLite — write-seq -> write-batch, V Large, 14, 3, 4; read-rand -> read-seq, Medium, 1, 1. SaC — Workload, V Large, 50, 16, 10. Implication: Avoid wasting budget on non-informative parts of the configuration space and focus where it matters.
  • 137. Observation 2: Influential options and interactions are preserved across environments. Software / environmental change / severity / dim. / t-test: x264 — Version, Large, 16, 12, 10; hardware/workload/version, V Large, 8, 9. SQLite — write-seq -> write-batch, V Large, 14, 3, 4; read-rand -> read-seq, Medium, 1, 1. SaC — Workload, V Large, 50, 16, 10. Implication: Avoid wasting budget on non-informative parts of the configuration space and focus where it matters. We only need to explore part of the space: 2^16 / 2^50 ≈ 0.000000000058
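The implication can be made concrete with synthetic data: if only 16 of 50 hypothetical binary options influence performance, an ordinary least-squares fit recovers them, and the reduced sub-space is a vanishing fraction of the full configuration space:

```python
import numpy as np

rng = np.random.default_rng(3)

# 50 hypothetical binary options; only options 0..15 actually affect performance.
n, d, k = 200, 50, 16
X = rng.integers(0, 2, size=(n, d)).astype(float)
true_w = np.zeros(d)
true_w[:k] = rng.uniform(1, 5, k)
y = X @ true_w + rng.normal(0, 0.1, n)

# Rank options by the magnitude of their fitted linear coefficients.
w, *_ = np.linalg.lstsq(np.column_stack([X, np.ones(n)]), y, rcond=None)
top = np.argsort(-np.abs(w[:d]))[:k]
print(sorted(top.tolist()))  # the influential options

# Sampling only the influential sub-space covers a vanishing fraction of C:
print(2 ** k / 2 ** d)  # 2^16 / 2^50, on the order of 5.8e-11
```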
  • 138. Transfer Learning for Performance Modeling of Configurable Systems: An Exploratory Analysis. Pooyan Jamshidi (Carnegie Mellon University, USA); Norbert Siegmund (Bauhaus-University Weimar, Germany); Miguel Velez, Christian Kästner, Akshay Patel, Yuvraj Agarwal (Carnegie Mellon University, USA). Abstract—Modern software systems provide many configuration options which significantly influence their non-functional properties. To understand and predict the effect of configuration options, several sampling and learning strategies have been proposed, albeit often with significant cost to cover the highly dimensional configuration space. Recently, transfer learning has been applied to reduce the effort of constructing performance models by transferring knowledge about performance behavior across environments. While this line of research is promising to learn more accurate models at a lower cost, it is unclear why and when transfer learning works for performance modeling. To shed light on when it is beneficial to apply transfer learning, we conducted an empirical study on four popular software systems, varying software configurations and environmental conditions, such as hardware, workload, and software versions, to identify the key knowledge pieces that can be exploited for transfer learning. Our results show that in small environmental changes (e.g., homogeneous workload change), by applying a linear transformation to the performance model, we can understand the performance behavior of the target environment, while for severe environmental changes (e.g., drastic workload change) we can transfer only knowledge that makes sampling more efficient, e.g., by reducing the dimensionality of the configuration space. Index Terms—Performance analysis, transfer learning. Fig. 1: Transfer learning is a form of machine learning that takes advantage of transferable knowledge from source to learn an accurate, reliable, and less costly model for the target environment. Details: [ASE ’17]
  • 140. What will the software systems of the future look like?
  • 141. Software 2.0 Increasingly customized and configurable VISION
  • 145. Software 2.0 Increasingly customized and configurable VISION Increasingly competing objectives Accuracy Training speed Inference speed Model size Energy
  • 146. Deep neural network as a highly configurable system. [The slide shows a cell-based network graph in which every node is parameterized by options such as the number of filters N, an operation mask op, the number of groups, kernel sizes ks, and dilations d — each cell is itself a configuration.]
  • 147. Exploring the design space of deep networks — Optimal Architecture (Yesterday). [The slide shows cell-based image-classification models from an architecture-search paper: a small CIFAR-10 model used during the search, a large CIFAR-10 model, and an ImageNet model built from the learned cells.]
  • 148. Exploring the design space of deep networks — Optimal Architecture (Yesterday); a New Fraud Pattern arrives.
  • 149. Exploring the design space of deep networks — the New Fraud Pattern forces a move from the Optimal Architecture (Yesterday) to the Optimal Architecture (Today).
  • 152. Deep architecture deployment [2.44, 0.88] [9.93, 0.77]
  • 153. To achieve the optimal results, it is important to explore: • Network architectures • Software libraries [TensorFlow, CNTK, Theano, etc] • Deployment infrastructure
  • 154. Exploring the design space of deep networks 0.2 0.4 0.6 0.8 1 1.2 Inference time [h] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Validationerror better better
  • 155. Exploring the design space of deep networks 0.2 0.4 0.6 0.8 1 1.2 Inference time [h] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Validationerror better better
  • 156. Exploring the design space of deep networks 0.2 0.4 0.6 0.8 1 1.2 Inference time [h] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Validationerror Default subset, and evaluated on the validation subset the selected cell is plugged into a large mode validation sub-sets, and the accuracy is report is never used for model selection, and it is only cells, learned on CIFAR-10, in a large-scale se image sep.conv3x3/2 sep.conv3x3/2 sep.conv3x3 conv3x3 globalpool linear&softmax small CIFAR-10 model cell cell cell sep.conv3x3 image conv3x3/2 conv3x3/2 sep.conv3x3/2 Imag cell cell cell Figure 2: Image classification models construc Top-left: small model used during architectu model used for learned cell evaluation. Bottom better better
  • 157. Exploring the design space of deep networks 0.2 0.4 0.6 0.8 1 1.2 Inference time [h] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Validationerror Default Pareto optimal subset, and evaluated on the validation subset the selected cell is plugged into a large mode validation sub-sets, and the accuracy is report is never used for model selection, and it is only cells, learned on CIFAR-10, in a large-scale se image sep.conv3x3/2 sep.conv3x3/2 sep.conv3x3 conv3x3 globalpool linear&softmax small CIFAR-10 model cell cell cell sep.conv3x3 image conv3x3/2 conv3x3/2 sep.conv3x3/2 Imag cell cell cell Figure 2: Image classification models construc Top-left: small model used during architectu model used for learned cell evaluation. Bottom better better
  • 158. Exploring the design space of deep networks 0.2 0.4 0.6 0.8 1 1.2 Inference time [h] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Validationerror Default Pareto optimal subset, and evaluated on the validation subset the selected cell is plugged into a large mode validation sub-sets, and the accuracy is report is never used for model selection, and it is only cells, learned on CIFAR-10, in a large-scale se image sep.conv3x3/2 sep.conv3x3/2 sep.conv3x3 conv3x3 globalpool linear&softmax small CIFAR-10 model cell cell cell sep.conv3x3 image conv3x3/2 conv3x3/2 sep.conv3x3/2 Imag cell cell cell Figure 2: Image classification models construc Top-left: small model used during architectu model used for learned cell evaluation. Bottom Architecture search is carried out entirely on the CIFAR-10 training set, which w sub-sets of 40K training and 10K validation images. Candidate models are trained subset, and evaluated on the validation subset to obtain the fitness. Once the search the selected cell is plugged into a large model which is trained on the combination validation sub-sets, and the accuracy is reported on the CIFAR-10 test set. We note is never used for model selection, and it is only used for final model evaluation. We cells, learned on CIFAR-10, in a large-scale setting on the ImageNet challenge data image sep.conv3x3/2 sep.conv3x3/2 sep.conv3x3 conv3x3 globalpool linear&softmax small CIFAR-10 model cell cell cell sep.conv3x3/2 sep.conv3x3 sep.conv3x3/2 sep.conv3x3 image conv3x3 large CIFAR-10 model cell cell cell cell cell sep.conv3x3/2 sep.conv3x3 sep.conv3x3/2 sep.conv3x3 sep.conv3x3 sep.conv3x3 image conv3x3/2 conv3x3/2 sep.conv3x3/2 globalpool linear&softmax ImageNet model cell cell cell cell cell cell cell Figure 2: Image classification models constructed using the cells optimized with arc Top-left: small model used during architecture search on CIFAR-10. Top-right: better better
  • 159. Exploring the design space of deep networks [same figure, repeated as a build]
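The Pareto-optimal subset highlighted on the slide can be computed with a simple dominance check over (inference time, validation error) pairs: a candidate survives only if no other candidate is at least as good on both objectives and strictly better on one. A minimal sketch, with illustrative numbers rather than measured values:

```python
# Sketch: selecting the Pareto-optimal subset of candidate networks when
# trading off inference time against validation error (lower is better
# on both axes). The (time, error) pairs are made up for illustration.

def pareto_front(points):
    """Return the points not dominated by any other point."""
    front = []
    for time, err in points:
        dominated = any(
            (t <= time and e <= err) and (t < time or e < err)
            for t, e in points
        )
        if not dominated:
            front.append((time, err))
    return sorted(front)

candidates = [(0.2, 0.55), (0.4, 0.30), (0.6, 0.28), (0.8, 0.22), (1.0, 0.25)]
print(pareto_front(candidates))  # (1.0, 0.25) is dominated by (0.8, 0.22)
```

This quadratic-time check is enough for the handful of candidates an architecture search typically surfaces; larger design spaces would call for a sort-based sweep instead.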
  • 160. ML pipeline [from Jeff Smith's book Machine Learning Systems, on reactive ML]
  • 161. Architectural tradeoffs for production-ready ML systems (Uber)
  • 174. Configuration complexity and dependencies between options are a major source of configuration errors [Figure: the Default configuration under Operational contexts I, II, and III]
  • 175. Configuration complexity and dependencies between options are a major source of configuration errors [Diagram: Configuration Options and the Apache Hadoop Architecture]
  • 176. Configuration complexity and dependencies between options are a major source of configuration errors [Diagram: Configuration Options associated to the Apache Hadoop Architecture]
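Dependencies between options mean that a configuration can be invalid even when every individual value is in range. One way to catch such errors before deployment is to encode the cross-option constraints explicitly and check them against a candidate configuration. A minimal sketch; the option names and constraints below are hypothetical, Hadoop-flavoured examples, not real Hadoop semantics:

```python
# Sketch: detecting inter-option dependency violations in a configuration.
# Both constraints are deliberately violated by this example config.

config = {
    "mapreduce.map.memory.mb": 2048,
    "mapreduce.map.java.opts.mb": 3072,   # JVM heap must fit in the container
    "dfs.replication": 3,
    "cluster.datanodes": 2,               # fewer nodes than replicas
}

# Each constraint pairs a predicate over the whole config with a message.
constraints = [
    (lambda c: c["mapreduce.map.java.opts.mb"] <= c["mapreduce.map.memory.mb"],
     "JVM heap exceeds container memory"),
    (lambda c: c["dfs.replication"] <= c["cluster.datanodes"],
     "replication factor exceeds number of datanodes"),
]

violations = [msg for check, msg in constraints if not check(config)]
print(violations)
```

Making the dependencies first-class like this is what allows a tool (rather than a production outage) to report the conflict.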
  • 180. Configuration errors are common [Pie chart: 69% configuration errors vs. 31% other errors; source: cobalt.io]
  • 182. We can find repair patches faster and with a lighter workload • [localization]: use transfer learning to derive the configurations most likely to manifest the bug • [repair]: automatically prepare patches to fix the configuration bug
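The localization step above can be sketched with a very simple form of transfer: reuse pass/fail observations gathered in a cheap source environment to rank which target configurations to try first. This is only an illustrative nearest-neighbour stand-in for the transfer-learning models the slide refers to, and the configurations and labels are made up:

```python
# Sketch: rank target configurations by how likely they are to manifest a
# bug, reusing observations from a cheaper source environment.

def rank_by_similarity(source_obs, target_configs):
    """Order target configs by distance to the nearest source config
    that manifested the bug (most suspicious first)."""
    failing = [cfg for cfg, buggy in source_obs if buggy]

    def distance(a, b):
        # Hamming distance over binary option settings
        return sum(x != y for x, y in zip(a, b))

    return sorted(target_configs,
                  key=lambda cfg: min(distance(cfg, f) for f in failing))

# Source-environment observations: (config, manifested_bug?)
source = [((1, 0, 1), True), ((0, 0, 0), False), ((1, 1, 1), True)]
targets = [(0, 1, 0), (1, 0, 0), (1, 1, 0)]
print(rank_by_similarity(source, targets))
```

Testing configurations in this order front-loads the ones most likely to reproduce the bug, which is what lets the repair step run under a lighter workload.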
  • 183. Thanks