2/7/2014 A. Gilgur & J. Ferrandiz. Predictive SPC 1
Alex Gilgur, Josep Ferrandiz
ASA Conference on Statistical Practice
Tampa, FL. February 2014
Outline
• SPC = Statistical Process Control
• The Fishbone of SPC
• Traditional SPC
• Six Sigma
• Predictive SPC:
– Univariate
– Multivariate
1968
•Artificial Intelligence
•Predictive Analytics
•Data Mining
•Machine Learning
The Fishbone of SPC
SPC
• Specifications: Target, Upper Spec Limit, Lower Spec Limit
• Domain: What to measure?, Science / Engineering / Math
• Process Dynamics: Distribution, Stationarity, Dependencies
Setting Specs is an optimization problem
Servers = argmax (Revenue |Budget)
Revenue = f[Throughput (Servers, SW, Budget)]
Servers = argmin (Budget | Revenue)
•Throughput = t (UX)
•Revenue = r (Throughput)
•Budget = f(SW, Servers)
Constraints:
•Domain
•Budget ≤ B
The business drives the specs
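A minimal sketch of the "Servers = argmin (Budget | Revenue)" formulation, under stated assumptions: revenue grows monotonically with server count, budget is a fixed cost per server, and the revenue curve `toy_revenue` is purely hypothetical (none of these names or numbers come from the slides).

```python
def cheapest_fleet(revenue_target, budget_cap, cost_per_server, revenue_for):
    """Smallest server count that meets the revenue target with Budget <= B.

    Assumes revenue_for(servers) is non-decreasing, so the first count that
    reaches the target is also the cheapest. Returns None if infeasible.
    """
    max_servers = int(budget_cap // cost_per_server)  # constraint: Budget <= B
    for servers in range(1, max_servers + 1):
        if revenue_for(servers) >= revenue_target:
            return servers
    return None


def toy_revenue(servers):
    # Hypothetical diminishing-returns curve, for illustration only.
    return 100.0 * servers / (1.0 + 0.05 * servers)


best = cheapest_fleet(revenue_target=600.0, budget_cap=50_000.0,
                      cost_per_server=2_000.0, revenue_for=toy_revenue)
```

The resulting server count is the kind of number that becomes a specification: the business target fixes it, not the process.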
Specifications
Stakeholders
X Y
LSL, Tgt, USL
X = f⁻¹(Y)
LSL, Tgt, USL
From X to Y to X
Domain
“Knobs” to turn?
Specifications
Upper Spec Limit
Target
Lower Spec Limit
Domain
Science/Engineering/Math
What to measure?
Closing the loop
Specs to adjust?
Process Dynamics
Stationarity
Distribution
Dependencies
Does External Interaction Only Lead to Improvement?
Process Dynamics: Short-term vs. Long-term
[Figure: Value (IO/sec, % Util, CPU Load, …) over a timeline Day 1 … Day N, with HIGH and LOW levels marked]
Data Collection hides the true distribution
Process Dynamics: Traditional SPC
μ == Target;
μ + 6*σ <= USL;
μ - 6*σ >= LSL;
IDEAL CASE: on target; in control; within the specifications
[Figure: distribution with μ, σ, LCL, UCL, LSL, USL, and Target marked]
Process Dynamics: a Shift
μ != Target;
μ + 6*σ > USL;
μ - 6*σ >= LSL;
REAL CASE #1: off target; in control; out of the specifications
[Figure: distribution shifted (SHIFT) away from Target, with μ, σ, LSL, USL, LCL, UCL, and P-value marked]
1) Negotiate specs
2) Change the process
Process Dynamics: a Change in Variance
μ != Target;
μ + 6*σ > USL;
μ - 6*σ > LSL;
REAL CASE #2: off target; out of control; out of specifications
[Figure: widened distribution with μ, σ, LSL, USL, Target, LCL, UCL, and P-value marked]
1) Negotiate specs
2) Change the process
Bimodal example
SPC Measures
• Cp – a measure of the process capability to produce consistent results:
– Cp = (USL – LSL) / (6 * σ)
– Desired Cp >= 1.0
– High Cp -> “In control”
• Cpk – a measure of the process capability to produce results that are on target:
– Cpk = Min { (μ – LSL) / (3 * σ), (USL – μ) / (3 * σ) }
– Desired Cpk >= 1.33
– High Cpk -> “In control and On Target”
• Cp >= Cpk > 1.33 -> “In Control, On Target” (Cpk never exceeds Cp; the two are equal when μ is centered between LSL and USL)
• Cpk < Cp < 1.0 -> “Out of Control, Off Target”
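The two capability indices can be computed directly from a sample. The sketch below assumes the sample standard deviation is an acceptable estimate of σ; the sample values and spec limits are made up for illustration.

```python
import statistics


def capability(samples, lsl, usl):
    """Return (Cp, Cpk) using the sample mean and sample standard deviation."""
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples)
    cp = (usl - lsl) / (6 * sigma)
    cpk = min((mu - lsl) / (3 * sigma), (usl - mu) / (3 * sigma))
    return cp, cpk


# Illustrative measurements (hypothetical), centered between the spec limits.
samples = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
cp, cpk = capability(samples, lsl=9.0, usl=11.0)

# Same data, but with the upper spec pulled in: Cpk drops below Cp.
cp_off, cpk_off = capability(samples, lsl=9.0, usl=10.5)
```

With a centered process the two indices coincide; moving either spec limit toward the mean lowers Cpk faster than Cp.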
SPC Measures: Another way to look at it
• Z – a measure of the process capability to produce results within specs:
– Zlower = (μ – LSL) / σ,
– Zupper = (USL – μ) / σ.
– Cpk = (1/3) * min (Zlower, Zupper)
– Long-term and short-term Z:
• Typically, desired Zst = 6
• Zlt = Zst - 1.5
Zst == 2 : 310,000 defects per 1,000,000 opportunities (69%)
Zst == 3 : 67,000 defects per 1,000,000 opportunities (93.3%)
Zst == 6 : 3.4 defects per 1,000,000 opportunities (99.99966%)
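The defects-per-million figures on this slide follow from the normal tail beyond the long-term Z. The sketch below assumes a one-sided tail and the conventional 1.5 shift (Zlt = Zst - 1.5); the function name is illustrative.

```python
import math


def dpmo_from_zst(z_st, shift=1.5):
    """Defects per million opportunities for a short-term Z, one-sided tail."""
    z_lt = z_st - shift
    # Upper-tail probability of a standard normal: 1 - Phi(z) = erfc(z / sqrt(2)) / 2
    tail = 0.5 * math.erfc(z_lt / math.sqrt(2))
    return 1_000_000 * tail


for z in (2, 3, 6):
    print(f"Zst = {z}: {dpmo_from_zst(z):,.1f} DPMO")
```

The yield quoted in parentheses on the slide is simply 1 minus the tail probability.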
Statistical Process Control
• SPC = Statistical Process Control
• The Fishbone of SPC
• Traditional SPC
• Six Sigma
• Predictive SPC:
– Univariate
– Multivariate
Six Sigma
http://www.isixsigma.com/
1. Define:
i. End Goal
ii. What to measure
iii. How to measure
2. Measure:
i. Gage R&R
ii. Collect data
3. Analyze:
i. Mean? Variance? Shape? All three?
ii. Correlations
4. Improve:
i. Design and Conduct Experiments
ii. Analyze the results
5. Control:
i. SPC
ii. Education & Training
iii. $avings
Statistical Process Control
• SPC = Statistical Process Control
• The Fishbone of SPC
• Traditional SPC
• Six Sigma
• Why is it not Good Enough?
• Predictive SPC:
– Univariate
– Multivariate
Why is SPC / Six Sigma not good enough?
Rapid growth
New product introduction
Agile Development
R&D / Science
… … …
Zst == 2 : 310,000 defects per 1,000,000 opportunities (69%)
Zst == 3 : 67,000 defects per 1,000,000 opportunities (93.3%)
Zst == 6 : 3.4 defects per 1,000,000 opportunities (99.99966%)
Manufacturing
Mechanical / Chemical
Semiconductor
Food
Pharmaceutical
Power Plants
Traffic Engineering
Customer Support
… … …
Why is SPC / Six Sigma not good enough?
✓ Normal distribution
✓ Stationary processes
✓ Well defined LSL, Tgt, USL
Something like this:
Or this:
A chemical process:
•Known Specifications
•Unknown Dynamics
•Non-stationary
A Data Center:
•Unknown Specifications
•Somewhat known Dynamics
•Non-stationary
Univariate Predictive SPC
1. Forecast the metric:
   – Holt-Winters
   – ARIMA
   – … … …
2. An alternative:
   1. Observe the process
   2. Is the measurement within the predicted interval?
[Figure: metric plotted against Time]
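The forecast-then-check loop can be sketched as follows. This uses Holt's linear-trend smoothing, a simpler relative of Holt-Winters without the seasonal component, and a ±3 × RMSE band as a stand-in for a proper prediction interval; both choices, the smoothing constants, and the data are assumptions for illustration.

```python
import math


def holt_forecast(series, alpha=0.5, beta=0.3):
    """One-step-ahead forecast and RMSE of the in-sample one-step errors."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    level = series[1]
    trend = series[1] - series[0]
    errors = []
    for y in series[2:]:
        predicted = level + trend          # forecast made before seeing y
        errors.append(y - predicted)
        new_level = alpha * y + (1 - alpha) * predicted
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    spread = math.sqrt(sum(e * e for e in errors) / len(errors))
    return level + trend, spread


def within_interval(observed, forecast, spread, z=3.0):
    """True if the new measurement falls inside forecast +/- z * spread."""
    return abs(observed - forecast) <= z * spread


history = [100, 104, 109, 113, 118, 121, 127, 130]  # hypothetical metric
forecast, spread = holt_forecast(history)
is_expected = within_interval(135, forecast, spread)
```

A measurement outside the band is the predictive analogue of a point outside the control limits; whether it is an outlier or the start of a new pattern is a separate decision.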
Univariate Predictive SPC: what do we do?
[Figure: two time-series plots; x-axis: Time]
Outlier?
•Do nothing
Start of new pattern?
•Reforecast
•The clock:
•Continue from the beginning?
•Reset at reforecast point?
When do we decide?
Process Dynamics
Stationarity
Distribution
Dependencies
A Business Model
Application DB
Makes sense to use parabola …
Traffic = # of units of work (worker threads in use) in the system
Little’s Law:
Traffic = Arrival_rate * Processing_time
Arrival_rate = a * BMI + b
Processing_time = c * Arrival_rate + d
Traffic = (a * BMI + b) * [c * (a * BMI + b) + d]
Traffic = f * BMI² + g * BMI + h
Multivariate Predictive SPC
x = throughput
r = response time
q = # of worker threads
BM = business metric
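Expanding the Little's Law product gives the quadratic coefficients explicitly: f = a²c, g = a(2bc + d), h = b(bc + d). The sketch below checks that expansion numerically; the parameter values are arbitrary and only serve the check.

```python
def traffic_coefficients(a, b, c, d):
    """Coefficients (f, g, h) of Traffic = f*BMI^2 + g*BMI + h."""
    f = a * a * c
    g = a * (2 * b * c + d)
    h = b * (b * c + d)
    return f, g, h


def traffic_direct(bmi, a, b, c, d):
    """Traffic from Little's Law: arrival rate times processing time."""
    arrival_rate = a * bmi + b
    processing_time = c * arrival_rate + d
    return arrival_rate * processing_time


# Arbitrary illustrative parameters.
a, b, c, d = 2.0, 3.0, 0.5, 1.0
f, g, h = traffic_coefficients(a, b, c, d)
for bmi in (0.0, 1.0, 10.0):
    assert abs(traffic_direct(bmi, a, b, c, d) - (f * bmi**2 + g * bmi + h)) < 1e-9
```

This is why a parabola is the natural shape to fit: it is implied by the two linear assumptions, not chosen for convenience.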
Quadratic fit…
Application DB
Check the residuals
KPM ~ BMI² + BMI
BMI (MCPS = million clicks per second)
KPM (Concurrency = Units of Work in the system)
Application DB
The curve fit is a-OK
Outliers
Patterns:
Top != Bulk != Bottom
Residuals…
What’s SPC got to do with it?
Application DB
80 MCPS -> {290… 480} 100 MCPS -> {380… 600}
Quantile regression:
• Independent top, bulk, bottom
• A range of KPM for each slice of BM
• Robust to outliers
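The "range of KPM for each slice of BM" idea can be approximated without fitting anything. The sketch below is not quantile regression; it is a cruder stand-in that bins BM and takes empirical percentiles per bin. The bin width, percentile choices, and data are assumptions for illustration.

```python
import statistics
from collections import defaultdict


def kpm_ranges(points, bin_width, low_pct=5, high_pct=95):
    """Map each BM slice start to (low, median, high) KPM percentiles.

    points: iterable of (bm, kpm) pairs. Slices with fewer than two
    observations are skipped.
    """
    slices = defaultdict(list)
    for bm, kpm in points:
        slices[int(bm // bin_width) * bin_width].append(kpm)
    ranges = {}
    for start, values in sorted(slices.items()):
        if len(values) < 2:
            continue
        # 99 cut points; cuts[k - 1] is the k-th percentile.
        cuts = statistics.quantiles(values, n=100, method="inclusive")
        ranges[start] = (cuts[low_pct - 1], statistics.median(values),
                         cuts[high_pct - 1])
    return ranges


# Hypothetical observations: BM between 80 and 90, KPM between 300 and 400.
points = [(80 + i * 0.1, 300 + i) for i in range(101)]
ranges = kpm_ranges(points, bin_width=20)
```

Because the top, middle, and bottom percentiles are computed independently, a handful of outliers moves them far less than it would move a mean and standard deviation.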
80 MCPS -> {290… 480} 100 MCPS -> {380 … 600}
New data arrived
Application DB
80 MCPS -> {420… 700} 100 MCPS -> {540… 800}
Packets ~ qps
Packets ~ qps
Another Example: How can we use this as a method?
Baseline: @ 100 qps:
M = 65 mln concurrent packets
R = (30…110) mln concurrent packets
New Data: @ 100 qps:
M = 63 mln concurrent packets
R = (30…80) mln concurrent packets
Target
(LCL…UCL)
(LSL…USL)
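Reading the baseline median as the Target and the baseline range as the limits, the comparison on this slide can be sketched as a simple check. The dictionary layout and the three reported measures are illustrative assumptions; the numbers are the ones quoted at 100 qps.

```python
def compare_to_baseline(baseline, new):
    """Compare a new (median, range) summary to the baseline at the same BM value."""
    base_low, base_high = baseline["range"]
    new_low, new_high = new["range"]
    return {
        # Is the new median still inside the baseline limits?
        "median_within_baseline_range": base_low <= new["median"] <= base_high,
        # Is the whole new range inside the baseline limits?
        "range_within_baseline_range": base_low <= new_low and new_high <= base_high,
        # Below 1.0 means the spread has narrowed; above 1.0, widened.
        "width_ratio": (new_high - new_low) / (base_high - base_low),
    }


baseline = {"median": 65, "range": (30, 110)}   # mln concurrent packets
new_data = {"median": 63, "range": (30, 80)}
report = compare_to_baseline(baseline, new_data)
```

Here the new data stay within the baseline limits and the spread narrows, so nothing would be flagged.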
“Knobs” to turn?
Specifications
Domain
Specs to adjust? Process Dynamics
Multivariate Predictive SPC: the general idea
Data are NOT stationary
Data are NOT normal
“Processes behind data” are understood
A general idea of “good” vs. “bad” behavior
– NO specifications is OK
Traffic = f * BMI² + g * BMI + h
@ BMI values of interest
➢ “Processes behind data” CAN BE DESCRIBED MATHEMATICALLY
➢ Closed-form
➢ Monte-Carlo
How can we use it?
Direct KPM-BMI link
ID degraded apps “Knobs” to turn?
Specifications
Domain
Specs to adjust? Process Dynamics
•Artificial Intelligence
•Predictive Analytics
•Data Mining
•Machine Learning
1968
2014
Predictive SPC
So, how else can we use this as a method?
Business
Metric
# Work Units # Servers
How else can we
use this as a method?
Universal
Scalability Law
# Servers
KPM
Specs
Ballast
Summary
• SPC Loop(s):
– The Why
– The What
– The How
• Six Sigma:
– DMAIC
• Next Generation:
– Predictive SPC
• Univariate Predictive SPC: Forecasting
• Multivariate Predictive SPC: Relationships among variables
– A way to stay ahead of the curve
– Domain specific
– Process agnostic
– Expandable, Flexible, and Robust
– An extension of Traditional SPC
– Successfully implemented in IT
“Knobs” to turn?
Specifications
Domain
Specs to adjust? Process Dynamics
Thank you!
www.isixsigma.com
www.amstat.org
www.cmg.org
www.linkedin.com
http://alexonsimanddata.blogspot.com/
http://josepferrandiz.blogspot.com/
“Statistical thinking will one day be as necessary for
efficient citizenship as the ability to read and write.”
- H.G.Wells (1866-1946)
THANK YOU
Appendix
• BACKUP SLIDES
Traditional SPC
[Figures: Daily Avg. TPS by Date, with HIGH and LOW levels and a range (R) chart; Avg. CPU utilization, %, by Date, with HIGH and LOW levels and a range (R) chart]
Alternatives: Entropy-based approach
(IN A CLOSED SYSTEM)
Entropy and probabilities
Boltzmann’s Entropy: S = k_B ln W
Shannon’s Entropy: H = −Σ p_i log p_i
∆H < 0 → external interaction detected
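A minimal sketch of the ∆H check, assuming the random variable is "which server received the transaction" and that entropy is measured in bits (log base 2). The server names and counts are invented for illustration.

```python
import math
from collections import Counter


def shannon_entropy(assignments):
    """Shannon entropy (bits) of the empirical distribution of labels."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical windows of transactions, labeled by the server that handled them.
before = ["s1", "s2", "s3", "s4"] * 25                          # evenly spread
after = ["s1"] * 70 + ["s2"] * 10 + ["s3"] * 10 + ["s4"] * 10   # skewed

delta_h = shannon_entropy(after) - shannon_entropy(before)
external_interaction = delta_h < 0
```

An even spread across four servers gives the maximum of 2 bits; any concentration of load lowers H, which is the signal the slide describes.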
Process Dynamics: A change in variance
Application DB
Server 1
Server 2
Server 3
Server N
Load
Balancer
Server [N+1]
Server [N+2]
Server [N + K]
Load Balancer:
• Has the load variance changed?
P-value
Practical Application: Load Balancer
Application DB
Server 1
Server 2
Server 3
Server N
Load Balancer
Server [N+1]
Server [N+2]
Server [N + K]
P-value
✓ Normal distribution
✓ Stationary processes
Xi: Server “i” got the transaction
wear and tear from within
External interaction
Does External Interaction Only Lead to Entropy Reduction?
Western Electric Rules
http://en.wikipedia.org/wiki/Western_Electric_rules
Nelson Rules
http://en.wikipedia.org/wiki/Nelson_rules
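As an illustration of how such run rules are applied, the sketch below implements only the first two Western Electric rules: a single point beyond 3σ, and two of three consecutive points beyond 2σ on the same side of the center line. The center line and σ are assumed to be known, and the data are invented.

```python
def western_electric_flags(values, center, sigma):
    """Return (rule1, rule2): indices at which each rule fires.

    Rule 1: the point lies more than 3 sigma from the center line.
    Rule 2: at least 2 of the last 3 points lie more than 2 sigma from the
            center line, on the same side.
    """
    z = [(v - center) / sigma for v in values]
    rule1 = [i for i, s in enumerate(z) if abs(s) > 3]
    rule2 = []
    for i in range(2, len(z)):
        window = z[i - 2 : i + 1]
        if sum(s > 2 for s in window) >= 2 or sum(s < -2 for s in window) >= 2:
            rule2.append(i)
    return rule1, rule2


values = [0, 0.5, 2.5, 0.1, 2.2, -0.3, 3.5]  # hypothetical, already in sigma units
rule1, rule2 = western_electric_flags(values, center=0.0, sigma=1.0)
```

Both rules presume a stable center line and σ, which is exactly the assumption that fails for the non-normal, non-stationary data discussed in the main slides.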
Neither normal nor stationary
Is it a Memory Leak (runaway process)?
Or is it expected behavior?
Why SPC?
• Because it’s cool
• Because it is logical
• Because we like to feel in control
• Because it saves $$
• Because we have the math all figured out
P = # Processors
From X to Y to X
A black box
y = f (x)
x
The same black box
y = f (t)
t
y
y = f (x, t) + ε (t)
BM, Q, X, and R as time series
Q (BM, t) = X(BM, t) * R(BM, t)
q
x
r
BM
x = throughput (TPS)
r = response time
q = concurrency (traffic)
BM = business metric
How many worker threads do we need to support the business?
Multivariate Predictive SPC
