Big Data &
The Trouble with ‘Normal’
Common Pitfalls in Capability/Performance Analysis
Barry Khor
barrykhor@gmail.com
All rights reserved
This document was developed with knowledge sharing in mind. Distribution and reproduction of this document, in part
or in whole, is freely encouraged provided authorship information is preserved.
“So let us not talk falsely now, the hour is getting late”
Bob Dylan
All along the watchtower

Preface
The author wishes to point out that the statistical techniques employed in these studies
are very basic, and they are not the focus of this document. The focus is instead on the
interpretation of data, and on the care and discipline needed in examining data for problem solving and
continuous improvement.
Secondly, this document is a work in progress. As such, the author welcomes comments and
critique.
However the document may evolve over time, it is the author’s hope that some
readers may derive benefit from these studies in their current state at any time.
Barry Khor
barrykhor@gmail.com
October 2017

“What are we waiting for, Christmas?”
“Things are not what they seem…”
CM Achuthan, author’s ex-boss/mentor at Hitachi Penang
“Trust, but verify.”
Dr. Irwin M Jacobs, Founder/CEO, Qualcomm
Yogi-ism:
“When you come to a fork in the road, take it”
“You can observe a lot just by watching”
“No one goes there nowadays, it is too crowded”
Yogi Berra
“It’s not that I’m so smart, it’s just that I stay with problems longer”
“In the middle of difficulty lies opportunity”
“Everybody is a genius. But if you judge a fish by its ability to climb a tree, it will live its whole
life believing that it is stupid”
Albert Einstein
“There is no such thing as an Electrical Failure”
Failure Analysis 101
“The hardest to learn is the least complicated”
Indigo Girls
Favorite quotes (some related to the subject matter, others not).

Common Pitfalls to Avoid
in Statistical Capability/Performance Analysis
March 2011
Barry Khor
Case Study 1 – with engineered data. Data points
were ‘made up’ using the NORMDIST function in Excel.

Let’s pretend this is the distribution of a measurement with a targeted mean
value of 50. The distribution appears very normal, with a perfect bell shape
and good symmetry around the mean value of 50. Most would stop right here,
declare the process “robust”, happily set the lower and upper
control limits at 4 sigma or 5 sigma for SPC, and call it a day.
What’s wrong with that?

The trouble with Normal is that it is not Normal.
The perfect-looking bell-shaped distribution (“overall”) is actually the
sum of 2 subgroups with very different mean values but similar
standard deviations. This kind of composite distribution is quite common in
high-volume production involving many machines whose settings can
change over time.
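As a rough illustration of how two offset subgroups can hide inside one bell shape, here is a minimal Python sketch. The subgroup means (48 and 52) and the common standard deviation (3) are hypothetical values chosen for illustration, not the numbers behind the slide. It relies on two standard results for an equal-weight mix of two normal distributions with the same standard deviation: the mix keeps a single peak as long as the means are no more than two standard deviations apart, and its overall variance is the within-group variance plus the square of half the offset.

```python
import math

def mixture_sd(sigma: float, offset: float) -> float:
    """Overall SD of an equal-weight mix of two normal subgroups that share
    the SD `sigma` and whose means differ by `offset`."""
    return math.sqrt(sigma ** 2 + (offset / 2) ** 2)

def looks_single_peaked(sigma: float, offset: float) -> bool:
    """An equal-weight mix of two equal-SD normals has a single peak when
    the two means are no more than 2 * sigma apart."""
    return abs(offset) <= 2 * sigma

# Hypothetical values: subgroup means 48 and 52, common SD 3.
sigma, offset = 3.0, 4.0
overall_sd = mixture_sd(sigma, offset)
single_peak = looks_single_peaked(sigma, offset)
print(f"within SD = {sigma:.2f}, overall SD = {overall_sd:.2f}, single peak: {single_peak}")
```

With these assumed numbers the combined histogram still shows one symmetric peak at 50, yet its spread is wider than that of either subgroup, which is the situation the slide describes.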
“Something is rotten in the state of Denmark…”
William Shakespeare, Hamlet.

Armed with the knowledge of the two subgroups and
the offset between their means, the process owner can target a
new setting to bring both groups together around
the targeted mean of 50. Mean shifts are generally
easier to correct because the shift is “translational”,
whereas variations around the mean are usually
harder to minimize due to their random,
noise-sensitive nature.
By centering the two subgroups around the targeted
mean of 50, the overall distribution becomes narrower
with a higher population around the mean,
i.e. “better central tendencies”, which usually
translates into better product performance.

Comparing the distribution before and after centering the subgroups: the new distribution is
significantly improved. The benefits are twofold:
1. Lower probability of rejection at the fringes (tails)
2. More units with ideal performance.
These benefits would have been lost had the wrong interpretation persisted from looking only at the
so-called BIG DATA. Always slice and dice to the extent allowable. Sometimes it may be necessary to stratify
using groupings that are not recorded in the data, such as sequence order or odd vs. even entries.
[Figure: histograms of the overall distribution, Before and After centering the subgroups]
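The two benefits listed above can be checked numerically. The sketch below is only an illustration under assumed numbers: subgroup means of 48 and 52 before centering, a common standard deviation of 3, and spec limits of 40 and 60 are all hypothetical and do not come from the slides.

```python
from statistics import NormalDist

# Hypothetical numbers, not taken from the case study.
TARGET, SIGMA = 50.0, 3.0
LSL, USL = 40.0, 60.0

def fraction_outside(means):
    """Fraction of an equal-weight mix of normal subgroups outside [LSL, USL]."""
    tails = []
    for m in means:
        d = NormalDist(m, SIGMA)
        tails.append(d.cdf(LSL) + (1 - d.cdf(USL)))
    return sum(tails) / len(tails)

def fraction_near_target(means, half_width=1.0):
    """Fraction of the mix that falls within +/- half_width of the target."""
    parts = []
    for m in means:
        d = NormalDist(m, SIGMA)
        parts.append(d.cdf(TARGET + half_width) - d.cdf(TARGET - half_width))
    return sum(parts) / len(parts)

before = [48.0, 52.0]   # two subgroups offset from the target
after = [50.0, 50.0]    # both subgroups centered on the target

print(f"outside limits: before {fraction_outside(before):.5f}, after {fraction_outside(after):.5f}")
print(f"near target:    before {fraction_near_target(before):.3f}, after {fraction_near_target(after):.3f}")
```

Under these assumptions the centered process has fewer units in the tails and more units close to the target, matching the direction of the two benefits.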

A Case Study –
Solder Height Data Analysis
March 2011
Barry Khor
Case Study 2 – with real but sanitized data. The data
source is kept anonymous because it is irrelevant
to the purpose of this study.

Looks pretty normal, right?
Be suspicious… be very suspicious!
Let’s take a closer look.
Histogram of production data supplied by an SMT contract manufacturer of an
OEM, in support of their claim that an irregularity in an IC component is responsible for bad
yields attributed to both insufficient solder and solder bridging. The data represents
10,784 paste print height measurements, one from each PCB pad, on 16 panels of 2 boards
each, with 337 pads per board (16 × 2 × 337 = 10,784). While not unprecedented, it can be considered BIG DATA.

To the trained eye, there are clues that this data needs further
analysis:
1. The abrupt cutoff at the left tail of the distribution suggests
some kind of screening or exclusion of data points. Excluding
data within the natural distribution for the purpose of
capability analysis is not good practice. It defeats the
purpose of the study.
2. The distribution is somewhat skewed to the right: the right side
of the mode (the 4.2 to 4.3 bin) has more data points than the left,
even with the left tail accounted for.
What the process owner might say:
Both the cutoff and the skewness are a natural response, since there is a
minimum thickness that is pre-ordained by the stencil’s thickness.

This is where things can become dangerous: when the process owner has only a very rudimentary understanding of the
concepts of capability analysis and performance analysis, a Minitab report somehow legitimizes the proclamation that the
process is well in control with a Cpk of 1.57. The correct message from this chart is that the performance of the process (as
indicated by Ppk = 0.66) is far below the intrinsic capability, or potential Cpk (1.57 in this case). The potential distribution is
a great Minitab feature that unfortunately can easily lead to a false sense of security if wrongly interpreted.
[Figure: Minitab report “Process Capability of Height”: histogram with LSL and USL marked, x-axis from 3.30 to 7.15, with overlaid Within and Overall curves]
Process Data
  LSL              3.5
  Target           *
  USL              7.5
  Sample Mean      4.68858
  Sample N         10784
  StDev(Within)    0.252208
  StDev(Overall)   0.604139
Potential (Within) Capability
  Cp     2.64
  CPL    1.57
  CPU    3.72
  Cpk    1.57
Overall Capability
  Pp     1.10
  PPL    0.66
  PPU    1.55
  Ppk    0.66
  Cpm    *
Observed Performance
  PPM < LSL    0.00
  PPM > USL    0.00
  PPM Total    0.00
Exp. Within Performance
  PPM < LSL    1.22
  PPM > USL    0.00
  PPM Total    1.22
Exp. Overall Performance
  PPM < LSL    24569.27
  PPM > USL    1.63
  PPM Total    24570.90
The EMS claimed a robust process with a Cpk of 1.57, well in excess of the generally accepted minimum Cpk of 1.33, citing
this Minitab-generated capability summary.

What is wrong with the preceding assessment?
The EMS claim of an acceptable Cpk was based on the
potential capability, but the actual performance is much
worse (Ppk = 0.66).
Even though the original claim is dismissed, there is still
an underlying issue: what caused the actual
performance to be much worse than the potential
performance? Fortunately, there is enough intelligence
in the raw data to help determine the root cause.
The following pages explain the root cause of the
underperformance.
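The gap between Cpk and Ppk can be reproduced from the figures in the capability report using the usual textbook formulas. The sketch below is a minimal illustration, not Minitab's own implementation; the function names are made up for this example, and the expected PPM figures assume a normal distribution.

```python
from statistics import NormalDist

# Values read from the capability report in this case study.
LSL, USL = 3.5, 7.5
MEAN = 4.68858
SD_WITHIN = 0.252208
SD_OVERALL = 0.604139

def capability(mean, sd, lsl, usl):
    """Return (two-sided index, lower index, upper index, min of lower/upper)."""
    two_sided = (usl - lsl) / (6 * sd)
    lower = (mean - lsl) / (3 * sd)
    upper = (usl - mean) / (3 * sd)
    return two_sided, lower, upper, min(lower, upper)

def expected_ppm_below(mean, sd, lsl):
    """Expected parts per million below LSL under a normal model."""
    return NormalDist(mean, sd).cdf(lsl) * 1e6

# Cp/Cpk use the within-subgroup SD; Pp/Ppk use the overall SD.
cp, cpl, cpu, cpk = capability(MEAN, SD_WITHIN, LSL, USL)
pp, ppl, ppu, ppk = capability(MEAN, SD_OVERALL, LSL, USL)
ppm_within = expected_ppm_below(MEAN, SD_WITHIN, LSL)
ppm_overall = expected_ppm_below(MEAN, SD_OVERALL, LSL)

print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}, Pp = {pp:.2f}, Ppk = {ppk:.2f}")
print(f"expected PPM < LSL: within {ppm_within:.2f}, overall {ppm_overall:.0f}")
```

The only difference between the two sets of indices is which standard deviation goes into the denominator, so a Ppk far below Cpk says that most of the observed spread comes from shifts between subgroups rather than from variation within them.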
Minitab Definition
Within and overall refer to different ways of estimating process variation. A within estimate, such as Rbar/d2, is based on variation within
subgroups. The overall estimate is the overall standard deviation for the entire study. Cp and Cpk are listed under Potential (Within)
Capability because they are calculated using the within estimate of variation. Pp and Ppk are listed under Overall Capability because they
are calculated using the overall standard deviation of the study.
The within variation corresponds to the inherent process variation defined in the Statistical Process Control (SPC) Reference Manual
(Chrysler Corporation, Ford Motor Company, and General Motors Corporation. Copyright by A.I.A.G) while overall variation corresponds
to the total process variation. Inherent process variation is due to common causes only. Overall variation is due to both common and
special causes. Cp and Cpk are called potential capability in Minitab, because they reflect the potential that could be attained if all special
causes were eliminated.
Big hint here: A within estimate, such as Rbar/d2, is based on variation within subgroups.
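To make the hint concrete, here is a small sketch of the Rbar/d2 within estimate next to the overall standard deviation. The four subgroups of two readings each are invented for illustration and are not the case-study data; the d2 values are the standard control chart constants for subgroup sizes 2 to 5.

```python
from statistics import mean, stdev

# d2 constants for subgroup sizes 2 to 5 (standard control chart table).
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}

def sigma_within(subgroups):
    """Within-subgroup SD estimated as (average subgroup range) / d2."""
    n = len(subgroups[0])
    rbar = mean([max(g) - min(g) for g in subgroups])
    return rbar / D2[n]

def sigma_overall(subgroups):
    """Sample SD of all readings pooled together, ignoring subgroups."""
    return stdev([x for g in subgroups for x in g])

# Invented readings: tight within each subgroup, but the subgroup level
# alternates between roughly 4.2 and 5.2.
subgroups = [[4.1, 4.3], [5.0, 5.2], [4.2, 4.0], [5.1, 5.3]]

s_within = sigma_within(subgroups)
s_overall = sigma_overall(subgroups)
print(f"within = {s_within:.3f}, overall = {s_overall:.3f}")
```

Because the ranges are taken inside each subgroup, shifts between subgroups never enter the within estimate, while they inflate the overall estimate directly.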

Process Capability Analysis (JMP)
Using the current limits, Ppk = 0.67 with a rejection rate of 2.5%.
At 2.5% pad rejection, board-level yield is essentially 0%.
Using the author’s suggested limits, Ppk = 0.18 with a rejection rate of 43%!
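The step from 2.5% pad rejection to essentially 0% board yield follows from the number of pads per board. The sketch below assumes that a board passes only if all 337 pads pass and that pad failures are independent; both are simplifying assumptions made here for illustration.

```python
PADS_PER_BOARD = 337       # from the case study
PAD_REJECT_RATE = 0.025    # 2.5% pad rejection, from the JMP analysis

def board_yield(pad_reject_rate: float, pads: int) -> float:
    """Probability that every pad on a board passes, assuming independent pads."""
    return (1 - pad_reject_rate) ** pads

yield_fraction = board_yield(PAD_REJECT_RATE, PADS_PER_BOARD)
print(f"board-level yield = {yield_fraction:.4%}")
```

Under these assumptions only about 2 boards in 10,000 would have every pad within limits, which is why a pad-level figure that sounds modest is unacceptable at board level.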

Could someone translate that?
Let’s give it a shot:
1. The actual process performance is indicated by the Ppk index. Cpk, as defined in the Minitab analysis, should be
taken to mean the potential capability IF special causes make no significant contribution to the variation.
Examination of the data indicates that special causes cannot be ignored here. The actual process performance is not
satisfactory even without considering the limits. In a well-qualified and controlled process, the Ppk should be close to
the intrinsic Cpk.
2. Since the big hint has to do with subgroups, the data was stratified into subgroups using the board design
information (number of pads per panel), and a trend emerged. The following slides show large variations or shifts between
successive panel numbers, and between board numbers within a panel. This special cause alone is the largest contributor to the
overall wide distribution. It indicates variation between the forward and reverse strokes of the squeegee/print head, and
uneven settings (squeegee pressure) on each board.
3. The overall spec limits of 0.0035” to 0.0075” are too loose or lenient, in the author’s opinion, for a 0.004” or 100 µm
stencil. The appropriate limits should be 0.004” to 0.005”. Using these new limits, the process capability becomes very
low. The conclusion is that the paste print process is not stable, and this is probably the dominant reason for the poor
soldering yields. It was recommended that the process be re-qualified after adjusting for the said differences between
the two squeegees in the print head.
4. The low paste thickness could contribute to poor soldering not just because of the lower solder volume but also because of the lower
amount of flux available for optimal solder reflow.
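The effect of the tighter limits in point 3 can be approximated from the Minitab process data. This is a sketch under assumptions: it uses the Minitab sample mean and overall standard deviation (in thousandths of an inch), treats the data as normal, and takes the suggested limits of 0.004” to 0.005” as 4.0 to 5.0 on the same scale. The JMP figures quoted earlier were computed by JMP from the raw data, so small differences are expected.

```python
from statistics import NormalDist

# Process data from the Minitab report (thousandths of an inch).
MEAN, SD_OVERALL = 4.68858, 0.604139
# Author's suggested limits, 0.004" to 0.005", on the same scale.
NEW_LSL, NEW_USL = 4.0, 5.0

dist = NormalDist(MEAN, SD_OVERALL)

# Ppk is the distance from the mean to the nearer limit, in units of 3 SD.
ppk_new = min(MEAN - NEW_LSL, NEW_USL - MEAN) / (3 * SD_OVERALL)

# Expected fraction of pads outside the suggested limits under a normal model.
reject_new = dist.cdf(NEW_LSL) + (1 - dist.cdf(NEW_USL))

print(f"Ppk = {ppk_new:.2f}, expected rejection = {reject_new:.1%}")
```

The result lands at roughly 0.17 and 43%, close to the JMP values of 0.18 and 43%, with the upper limit being the nearer one because the process mean sits well above the middle of the suggested window.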

Unleashing the power of data stratification: in this case the overall histogram is “SPLIT” 16 ways by
panel number, with the panels arranged in numeric sequence (down, then right to the top, and so on).
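The split described above can be sketched in a few lines. This example assumes the measurements are stored as one flat list in the order they were taken, panel by panel, which the source does not state. The heights are synthetic stand-in values, not the case-study measurements, and they are built to alternate between panels purely to show what the check looks like.

```python
from statistics import mean

PADS_PER_BOARD = 337
BOARDS_PER_PANEL = 2
PANEL_SIZE = PADS_PER_BOARD * BOARDS_PER_PANEL   # 674 readings per panel

def split_by_panel(heights, panel_size=PANEL_SIZE):
    """Cut a flat, measurement-ordered list into consecutive panels."""
    return [heights[i:i + panel_size] for i in range(0, len(heights), panel_size)]

def panel_means(heights):
    """Mean paste height for each panel, in panel order."""
    return [mean(panel) for panel in split_by_panel(heights)]

# Synthetic stand-in data (NOT the real measurements): odd-numbered panels
# print thin, even-numbered panels print thick.
heights = []
for panel_number in range(1, 17):
    level = 4.2 if panel_number % 2 else 5.2
    heights.extend([level] * PANEL_SIZE)

means = panel_means(heights)
odd_mean = mean(means[0::2])    # panels 1, 3, 5, ...
even_mean = mean(means[1::2])   # panels 2, 4, 6, ...
print(f"panels: {len(means)}, odd mean = {odd_mean:.2f}, even mean = {even_mean:.2f}")
```

On real data the same grouping would feed one histogram per panel; comparing odd and even panels is a quick first check for an effect tied to print stroke direction.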
The unmistakable alternating trend led to the conclusion that the screen printer’s squeegee was not set up correctly,
leading to a wide variation in squeegee pressure between the forward and backward strokes. Upon presentation of this data,
the complaint was immediately dropped.
BINGO!

The End
Questions? Comments?
Barry Khor
barrykhor@gmail.com
