Dancing With Uncertainty
Saša Misailović, Stelios Sidiroglou, Martin Rinard
MIT CSAIL
Example
Water: Simulates a system of liquid water molecules
[Figure: water molecules, each drawn as one O atom with two H atoms]
Dubstep
Explores the effects of selectively removing synchronization
Dubstep Highlights
1. Removing locks and using opportunistic barriers trades accuracy for performance
2. Automatically explores the tradeoff space induced by candidate transformations
3. Uses statistical analysis to characterize the impact of transformations on accuracy
Dubstep Workflow
Prepare → Find → Transform → Analyze → Navigate
Dubstep Workflow: Prepare
1. Prepare representative inputs
2. Prepare an accuracy model
 – Output abstraction (the important parts of the output)
 – Accuracy bound (the amount of tolerable error)
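As a rough illustration of the accuracy model above, the sketch below treats the output abstraction as a vector of numbers and the accuracy bound as a threshold on their mean relative difference. The names `OutputAbstraction`, `accuracyLoss`, and `withinBound`, and the choice of mean relative difference as the metric, are assumptions made for this example and are not taken from Dubstep.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical output abstraction: the numbers from the program output
// that matter when judging accuracy.
using OutputAbstraction = std::vector<double>;

// Mean relative difference between the abstraction of the original
// output and the abstraction of the transformed program's output.
// (Assumed metric; an application may define its own.)
double accuracyLoss(const OutputAbstraction& original,
                    const OutputAbstraction& transformed) {
    if (original.size() != transformed.size() || original.empty()) {
        // Outputs that cannot be compared are treated as unacceptable.
        return std::numeric_limits<double>::infinity();
    }
    double total = 0.0;
    for (std::size_t i = 0; i < original.size(); ++i) {
        double denom = std::fabs(original[i]);
        if (denom == 0.0) denom = 1.0;  // fall back to absolute difference
        total += std::fabs(original[i] - transformed[i]) / denom;
    }
    return total / static_cast<double>(original.size());
}

// Accuracy bound check: is the loss below the tolerable error?
bool withinBound(const OutputAbstraction& original,
                 const OutputAbstraction& transformed,
                 double bound) {
    return accuracyLoss(original, transformed) < bound;
}
```

With outputs {100, 200} and {101, 198}, the loss is 0.01, which passes a bound of 0.05 and fails a bound of 0.005.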
Dubstep Workflow: Find
Loops with parallel constructs
• Profiling: performance & memory
• Water: Interf (56.4%), Poteng (43.4%)
Dubstep Workflow: Transform
Removing synchronization
Original code:
void scratchPad::updateForces(double R[3][3]) {
    mutex_lock(this->lock);
    this->H1force.vecAdd(R[0]);
    this->Oforce.vecAdd(R[1]);
    this->H2force.vecAdd(R[2]);
    mutex_unlock(this->lock);
}
Transformed code, with the lock acquire and release removed:
void scratchPad::updateForces(double R[3][3]) {
    this->H1force.vecAdd(R[0]);
    this->Oforce.vecAdd(R[1]);
    this->H2force.vecAdd(R[2]);
}
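To make the lock-removal transformation concrete, here is a minimal self-contained sketch that puts both variants side by side. `Vec3`, `ScratchPad`, and the two method names are hypothetical stand-ins written for this example (the slides show only `updateForces`), and `std::mutex` is assumed in place of the `mutex_lock`/`mutex_unlock` calls.

```cpp
#include <mutex>

// Hypothetical 3-component vector standing in for the force vectors.
struct Vec3 {
    double v[3] = {0.0, 0.0, 0.0};
    void vecAdd(const double r[3]) {
        for (int i = 0; i < 3; ++i) v[i] += r[i];
    }
};

class ScratchPad {
public:
    // Original form: the lock makes the three additions one atomic update.
    void updateForcesLocked(const double R[3][3]) {
        std::lock_guard<std::mutex> guard(lock_);
        addAll(R);
    }

    // Transformed form: same arithmetic, no lock. When several threads
    // call this at once, some additions can be lost, which is the
    // accuracy cost being traded for speed.
    void updateForcesUnlocked(const double R[3][3]) { addAll(R); }

    Vec3 H1force, Oforce, H2force;

private:
    void addAll(const double R[3][3]) {
        H1force.vecAdd(R[0]);
        Oforce.vecAdd(R[1]);
        H2force.vecAdd(R[2]);
    }
    std::mutex lock_;
};
```

Called from a single thread, the two variants produce identical results; they can differ only under concurrent updates to the same object.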
Dubstep Workflow: Transform
Opportunistic barriers
void ensemble::interf() {
    parallel_for(interf_body, 0, NumMol - 1);
}
The original parallel_for:
• Schedule threads
• Execute interf_body in parallel
• Wait for all threads to complete
Transformed code, with an opportunistic barrier (parallel_for*):
void ensemble::interf() {
    parallel_for*(interf_body, 0, NumMol - 1);
}
• Schedule threads
• Execute interf_body in parallel
• Wait for half of the threads to complete
• Instruct the remaining threads to stop
[Rinard, OOPSLA 2007]
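The behavior listed for parallel_for* could be approximated as in the sketch below. This is not the implementation from the cited work: `opportunisticParallelFor`, its static interleaved schedule, and the use of `std::thread` with an atomic stop flag are all assumptions chosen to keep the example small. Workers check the flag between iterations, so "stop" here means "skip the iterations not yet started".

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs body(i) for i in [first, last] on numThreads workers. Once half of
// the workers (rounded up) have finished their share, the remaining
// workers are told to stop and skip their unstarted iterations.
// Returns the number of iterations that actually ran.
// body must be safe to call from several threads at once.
std::size_t opportunisticParallelFor(const std::function<void(int)>& body,
                                     int first, int last, int numThreads) {
    std::atomic<int> finishedWorkers{0};
    std::atomic<bool> stop{false};
    std::atomic<std::size_t> executed{0};
    const int needed = (numThreads + 1) / 2;

    auto worker = [&](int id) {
        // Static interleaved schedule: worker id takes id, id + T, ...
        for (int i = first + id; i <= last; i += numThreads) {
            if (stop.load()) return;  // remaining work is dropped
            body(i);
            executed.fetch_add(1);
        }
        // This worker completed its share; the one that brings the
        // count to `needed` releases the barrier for everyone.
        if (finishedWorkers.fetch_add(1) + 1 >= needed) stop.store(true);
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < numThreads; ++t) threads.emplace_back(worker, t);
    for (auto& th : threads) th.join();  // returns quickly once stop is set
    return executed.load();
}
```

With 1000 iterations on 4 workers, at least two workers always finish their 250 iterations, so between 500 and 1000 iterations run; the exact count varies from run to run.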
Dubstep Workflow: Analyze
Analyze transformed program:
• Criticality
 – Memory safety, integrity
• Performance
 – Speedup comparison
• Accuracy
 – Statistical analysis
[Figure: accuracy analysis. The same input is run through the original program and the transformed program. An application-specific output abstraction is applied to each output, and the difference between the two abstractions is compared against the bound δ.]
Dubstep Workflow: Navigate
Navigate the tradeoff space:
• Transform and analyze one location at a time
 – 3 locations in Water
• Transform multiple locations in the same candidate program
 – Guided by the results of the previous step
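The two navigation steps above can be sketched as follows. The pruning rule used here (only combine locations that were acceptable when transformed alone) is an assumption about what "guided by the results of the previous step" might mean, and `Location` and `combinedCandidates` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One transformable location and the outcome of analyzing it in isolation.
struct Location {
    std::string name;      // e.g. "LI"
    bool acceptableAlone;  // passed the single-location analysis
};

// Builds labels for candidate programs that transform two or more
// locations, using only locations that were acceptable on their own.
std::vector<std::string> combinedCandidates(const std::vector<Location>& locs) {
    std::vector<const Location*> good;
    for (const auto& l : locs) {
        if (l.acceptableAlone) good.push_back(&l);
    }

    std::vector<std::string> out;
    const std::size_t n = good.size();
    // Enumerate every subset of the surviving locations as a bit mask.
    for (std::size_t mask = 1; mask < (std::size_t{1} << n); ++mask) {
        std::string label;
        int count = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if (mask & (std::size_t{1} << i)) {
                if (count++ > 0) label += "+";
                label += good[i]->name;
            }
        }
        if (count >= 2) out.push_back(label);  // singles were already analyzed
    }
    return out;
}
```

For three acceptable locations named LI, BI, and BP, this yields LI+BI, LI+BP, BI+BP, and LI+BI+BP. Exhaustive enumeration is only practical for a handful of locations.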
Search Space Exploration
[Figure: Average Accuracy Loss vs. Speedup. x-axis: accuracy loss, 0 to 0.06; y-axis: relative speedup, 1 to 1.25. Plotted points: LI, BI, BR, LI+BI, LI+BP, BI+BP, LI+BI+BP.]
LI – Synchronization Interf
BI – Barrier Interf
BP – Barrier Poteng
Baseline: the original parallel program runs 6.2 times faster than the sequential program on 8 cores
How confident can we be about these observations?
Execution Reliability
The probability p that the transformed program, on a given input, produces a result with error less than the bound δ
While we cannot model p, we can specify a minimum acceptable reliability r
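Although p cannot be modeled directly, each run of the transformed program yields one observation: its error was either below δ or it was not. A minimal sketch of the resulting empirical estimate follows; `observedReliability` is a hypothetical helper, and the fraction it returns is only an estimate of p, not p itself.

```cpp
#include <cstddef>
#include <vector>

// Fraction of observed runs whose error stayed below the bound delta.
// errors[i] is the measured error of run i against the original output.
double observedReliability(const std::vector<double>& errors, double delta) {
    if (errors.empty()) return 0.0;  // no evidence yet
    std::size_t ok = 0;
    for (double e : errors) {
        if (e < delta) ++ok;
    }
    return static_cast<double>(ok) / static_cast<double>(errors.size());
}
```

For errors {0.01, 0.02, 0.2, 0.03} and δ = 0.05, three of the four runs are within the bound, giving 0.75.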
Execution Reliability
Determine if the program’s reliability p > r
How to pick N, the number of runs?
How Many Runs Are Enough?
Procedure that determines whether p > r:
• Returns the correct result most of the time
 – Wrong decision rate
 – Tolerance region
• Quickly identifies extreme (very good or very bad) transformations
Statistical Analysis
Sequential Probability Ratio Test
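For readers unfamiliar with the test, the sketch below follows Wald's textbook formulation of the sequential probability ratio test for yes/no outcomes. It is not necessarily the exact procedure used in Dubstep. The parameters `p0` and `p1` (the two hypothesized reliabilities, which one might place at the edges of a tolerance region around r) and `alpha` and `beta` (the two wrong-decision rates) are assumptions of this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

enum class Decision { AcceptReliable, RejectUnreliable, Undecided };

struct SprtResult {
    Decision decision;
    std::size_t runsUsed;  // observations consumed before deciding
};

// Wald's SPRT for Bernoulli observations.
// H0: reliability is p0 (too low); H1: reliability is p1 (high enough),
// with 0 < p0 < p1 < 1.
// alpha: chance of accepting H1 when H0 holds.
// beta:  chance of accepting H0 when H1 holds.
SprtResult sprt(const std::vector<bool>& withinBound,
                double p0, double p1, double alpha, double beta) {
    const double upper = std::log((1.0 - beta) / alpha);
    const double lower = std::log(beta / (1.0 - alpha));
    double llr = 0.0;  // running log-likelihood ratio of H1 against H0
    std::size_t n = 0;
    for (bool ok : withinBound) {
        ++n;
        llr += ok ? std::log(p1 / p0)
                  : std::log((1.0 - p1) / (1.0 - p0));
        if (llr >= upper) return {Decision::AcceptReliable, n};
        if (llr <= lower) return {Decision::RejectUnreliable, n};
    }
    return {Decision::Undecided, n};  // ran out of observations
}
```

With p0 = 0.9, p1 = 0.99, and alpha = beta = 0.05, a run of all-acceptable executions is accepted after 31 observations, while a transformation that always exceeds the bound is rejected after 2. This is the sense in which extreme transformations are resolved quickly.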
Bound (δ)    Best Transformation
0.01         LI
0.05         LI
0.10         LI+BI+BR
0.15         LI+BI+BR
Exploring Tradeoff Space
Quickstep [MIT-TR-2010-38, TECS/PEC 2012]
Start: Sequential program with for loops
Transformations:
• Parallel loop introduction
• Synchronization, Replication
Exploring Tradeoff Space
Loop Perforation [ICSE 2010, ONWARD 2010, SAS 2011, FSE 2011]
Start: Program with for loops
Transformations:
• Skip loop iterations (multiple forms)
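As a small illustration of skipping loop iterations, the sketch below computes a mean over every `stride`-th element of its input. This is only one simple form of perforation; `perforatedMean` is a hypothetical function and does not come from the cited papers.

```cpp
#include <cstddef>
#include <vector>

// Mean computed over every `stride`-th element of data.
// stride == 1 visits every element, which is the original loop.
double perforatedMean(const std::vector<double>& data, std::size_t stride) {
    if (data.empty() || stride == 0) return 0.0;
    double sum = 0.0;
    std::size_t used = 0;
    for (std::size_t i = 0; i < data.size(); i += stride) {  // skips stride - 1 of every stride iterations
        sum += data[i];
        ++used;
    }
    return sum / static_cast<double>(used);
}
```

On the values 1 through 100, the full loop returns 50.5 and the stride-2 loop returns 50.0, doing half the work for an error of about 1%.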
Exploring Tradeoff Space
Dynamic Knobs [ASPLOS 2011]
Start: Program with command-line parameters
Transformations:
• Alternate function versions activated by command-line parameters
Exploring Tradeoff Space
NapRed [POPL 2012]
Start: Program is a tree of Map-Reduce type tasks
Transformations:
• Function Substitution
• Reduction Sampling
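One way to picture reduction sampling is a sum that reads only a random subset of its inputs and scales the result back up. The sketch below is an illustration under that assumption and is not the transformation as defined in the cited paper; `sampledSum`, its `rate` and `seed` parameters, and the choice of `std::mt19937` are all hypothetical.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Estimates the sum of data from a random sample: each element is kept
// with probability `rate`, and the sample mean is scaled to the full size.
double sampledSum(const std::vector<double>& data, double rate, unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution keep(rate);
    double sum = 0.0;
    std::size_t kept = 0;
    for (double x : data) {
        if (keep(rng)) {  // discarded inputs cost no reduction work
            sum += x;
            ++kept;
        }
    }
    if (kept == 0) return 0.0;  // nothing sampled, no estimate available
    return sum / static_cast<double>(kept) * static_cast<double>(data.size());
}
```

With `rate` set to 1.0 every element is kept and the exact sum is returned; lower rates do proportionally less work and return an estimate whose error depends on how much the inputs vary.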
Exploring Tradeoff Space
Dubstep [Today: RACES 2012]
Start: Parallel program with for loops
Transformations:
• Removing Locks
• Opportunistic Barriers
Reasoning About Accuracy
Exploring levels of accuracy guarantees:
• Logic-based
• Probabilistic
• Statistical
• Empirical

Dancing with Uncertainty