3. Objectives
• Introduce the concepts of Stability and Variability
• Introduce the concepts of Centering and Variation of data
• Discuss the different types of data
• Introduce measures of Centering and Variability
• Demonstrate the calculation of Mean and Standard Deviation
• Introduce basic Minitab functions
4. Basic Statistics
Fundamentals of Improvement
• Stability
– How does the process perform over time?
→ Stability is represented by a constant mean and predictable variability over time.
[Figure: X-Bar Chart for Process A (Sample Mean vs. Sample Number): center line = 70.91, UCL = 77.20, LCL = 64.62]
[Figure: X-Bar Chart for Process B (Sample Mean vs. Sample Number): center line = 70.98, UCL = 77.27, LCL = 64.70]
Which process is the better process?
• Variability
– Is the process on target with minimum variability?
→ The mean is used to determine whether the process is on target. The standard deviation (σ) is used to measure variability.
5. Variation
• “While every process displays Variation, some processes display controlled variation, while other processes display uncontrolled variation” (Walter Shewhart)
• Controlled Variation is characterized by a stable and consistent pattern of variation over time
– Associated with Common Causes
• Uncontrolled Variation is characterized by variation that changes over time
– Associated with Special Causes
6. Variation Examples
[Figure: X-Bar Chart for Process A (Sample Mean vs. Sample Number): center line = 70.91, UCL = 77.20, LCL = 64.62]
[Figure: X-Bar Chart for Process B (Sample Mean vs. Sample Number): center line = 70.98, UCL = 77.27, LCL = 64.70, with points annotated as Special Causes]
Process A shows controlled variation
Process B shows uncontrolled variation
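The distinction can be sketched in code, assuming the usual reading of an X-bar chart in which a sample mean outside the control limits signals a special cause. The limits are the ones printed for Process B; the sample means are hypothetical values made up for illustration and are not read from the chart.

```python
# Control limits as printed on the X-Bar Chart for Process B.
UCL = 77.27
LCL = 64.70


def out_of_control(sample_means, lcl, ucl):
    """Return the 1-based sample numbers whose mean falls outside the limits."""
    return [i for i, m in enumerate(sample_means, start=1) if m < lcl or m > ucl]


# Hypothetical sample means (illustrative only, not data from the slides).
means = [70.2, 72.5, 69.8, 55.1, 71.0, 78.4, 70.6]

flagged = out_of_control(means, LCL, UCL)
print("Samples suggesting special causes:", flagged)
```

A process whose list of flagged samples stays empty over time is behaving like Process A; recurring flags point to uncontrolled variation worth investigating.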
7. Can Variability Be Tolerated?
• There will always be variability present in any process
[Figure: Traditional View: the region between LSL and USL, around the Target, is labeled Acceptable]
• Variability can be tolerated if:
– The process is on target
– The total variability is relatively small compared to the process specifications
– The process is stable over time
8. The New View of Variability
• Performance suffers when the process deviates from the target
[Figure: Loss Function: Cost plotted against process output, with LSL, Target, and USL marked]
• A Loss Function describes the cost associated with deviation from the target value
• Costs increase with variability, and are not just associated with performance outside of the specification limits
Source: Ranjit Roy: A Primer on the Taguchi Method
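One common way to write such a loss function is the quadratic form associated with the Taguchi method, L(y) = k (y - T)^2. The sketch below assumes that form and compares it with the traditional "in spec or out of spec" view. The target, spec limits, cost constant k, and scrap cost are hypothetical values chosen only for illustration.

```python
def quadratic_loss(y, target, k):
    """Quadratic loss: cost grows with the squared distance from the target."""
    return k * (y - target) ** 2


def traditional_loss(y, lsl, usl, scrap_cost):
    """Traditional view: no cost inside the spec limits, full cost outside."""
    return 0.0 if lsl <= y <= usl else scrap_cost


# Hypothetical numbers for illustration only.
TARGET, LSL, USL = 70.0, 64.0, 76.0
K = 0.5            # assumed cost per squared unit of deviation
SCRAP_COST = 18.0  # assumed cost of an out-of-spec unit

for y in (70.0, 73.0, 75.9, 76.1):
    print(y, quadratic_loss(y, TARGET, K), traditional_loss(y, LSL, USL, SCRAP_COST))
```

Under the traditional view the values 75.9 and 76.1 are treated completely differently, while the quadratic loss assigns them nearly the same cost.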
11. Sony TV Color Density Distribution
[Figure: Sony TV Color Density Distribution, showing Barely Acceptable Variability between LAL and UAL around TAR]
LAL = Lower Allowable Limit
UAL = Upper Allowable Limit
TAR = Target Value
Source: Ranjit Roy: A Primer on the Taguchi Method
14. Target Versus “In Spec.”
[Figure: Cost curve with LSL, Target, and USL marked, and two values marked X]
Is this value really different than this one?
Yet we are ready to use one and throw the other out!
15. Data Analysis Tasks for Improvement
• Determine if the process is stable
– If the process is not stable, identify and remove causes of instability
• Determine the location of the process mean
– Is it on target?
– If not, identify the variables which affect the mean and determine optimal settings to achieve the target value
• Estimate the magnitude of the total variability
– Is it acceptable with respect to the customer requirements (spec limits)?
– If not, identify the sources of the variability and eliminate or reduce their influence on the process
We will now review statistics that help this process
16. Basic Statistics
• Types of Data
• Measures of the Center of the Data
– Mean
– Median
• Measures of the Spread of Data
– Range
– Variance
– Standard Deviation
17. Types of Data
Attribute Data (Discrete) (Qualitative)
• Categories
– Machine 1, Machine 2, Machine 3
– Shift number
• Counted things
– Attribute Type 1: Placing Items into a Category (# good, # bad)
– Attribute Type 2: Counting Discrete Events (# scratches on coil)
18. Types of Data
Variable Data (Continuous) (Quantitative)
• Continuous Data (Decimal subdivisions are meaningful)
– Time (seconds)
– Pressure (psi)
– Conveyor Speed (ft/min)
– Rate (inches/min)
– etc.
19. Selecting Statistical Techniques
Input (X) | Output (Y) | Technique
Attribute | Attribute | Chi-square
Attribute | Variable | Analysis of Variance
Variable | Attribute | Logistic Regression
Variable | Variable | Correlation, Multiple Regression
There are different statistical techniques to cover all combinations of data types
20. Measures of Central Tendency
• Mean: Arithmetic average of a set of values
– Reflects the influence of all values
– Strongly influenced by extreme values
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
21. Measures of Central Tendency
• Median: Reflects the 50% rank: the center number in a sorted set of numbers
– Does not necessarily include all values in calculation
– Is “robust” to extreme scores
Why would we mainly use the mean, instead of the median, in process improvement efforts?
22. Example
As head of the University’s Communications Dept. you are asked to summarize the average starting salaries of Communications graduates.
$10, 20, 30, 40, 50 ($ in thousands)
What is the median income?
What is the mean income (or “center of gravity”)?
23. Example
However, under the advice of the Public Relations Dept. you consider including one of your former Communications majors: Shaquille O’Neal (a rather wealthy basketball star)
$10, 20, 30, 40, 5,000 ($ in thousands)
What is the median income?
What is the mean income (or “center of gravity”)?
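The two salary examples can be checked with a few lines of Python. This is a minimal sketch using the standard library's `statistics` module; the variable names are illustrative, and the data are the salary figures given in the two examples (in thousands of dollars).

```python
from statistics import mean, median

# Starting salaries in thousands of dollars, as given in the two examples.
salaries = [10, 20, 30, 40, 50]
salaries_with_star = [10, 20, 30, 40, 5000]

for label, data in (("original", salaries), ("with outlier", salaries_with_star)):
    print(label, "mean =", mean(data), "median =", median(data))
```

The single extreme value moves the mean from 30 to 1,020 while the median stays at 30, which is what "robust to extreme scores" means in practice.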
24. Measures of Variability
• Range: The distance between the extreme values of a data set (Highest - Lowest)
• Variance (σ²): The Average Squared Deviation of each data point from the Mean
• Standard Deviation (σ): The Square Root of the Variance
– The range is more sensitive to outliers than the variance
The most common and useful measure of variation is the standard deviation. Why?
26. Calculating Sigma
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
[Worksheet form: columns i, X, X - X̄, (X - X̄)²; rows 1 to 10, followed by Σ, Mean, s-square, and s]
Problem: Using the form above, calculate the standard deviation for the numbers: 2 1 3 5 4
27. Example 1
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
i | X | X - X̄ | (X - X̄)²
1 | 2 | -1 | 1
2 | 1 | -2 | 4
3 | 3 | 0 | 0
4 | 5 | 2 | 4
5 | 4 | 1 | 1
Σ | 15 | | 10
Mean | 3
s-square | 2.5
s | 1.581139
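The worksheet steps for Example 1 can be mirrored in code. This is a minimal sketch that follows the same columns (deviation from the mean, squared deviation) and divides by n - 1, as in the formula; the variable names are illustrative.

```python
from math import sqrt

data = [2, 1, 3, 5, 4]                       # the numbers from the problem
n = len(data)
x_bar = sum(data) / n                        # Mean row

deviations = [x - x_bar for x in data]       # X - X-bar column
squared = [d ** 2 for d in deviations]       # (X - X-bar) squared column

s_square = sum(squared) / (n - 1)            # sample variance (s-square row)
s = sqrt(s_square)                           # sample standard deviation (s row)

for i, (x, d, sq) in enumerate(zip(data, deviations, squared), start=1):
    print(i, x, d, sq)
print("Sum", sum(data), sum(squared))
print("Mean", x_bar, "s-square", s_square, "s", round(s, 6))
```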
28. Example 2a
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
i | X | X - X̄ | (X - X̄)²
1 | 1 | |
2 | 49 | |
3 | 50 | |
4 | 51 | |
5 | 99 | |
Σ | 250 | |
Mean | 50
s-square |
s |
29. Example 2b
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
i | X | X - X̄ | (X - X̄)²
1 | 1 | |
2 | 2 | |
3 | 50 | |
4 | 98 | |
5 | 99 | |
Σ | 250 | |
Mean | 50
s-square |
s |
30. Example 2a Solution
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
i | X | X - X̄ | (X - X̄)²
1 | 1 | -49 | 2401
2 | 49 | -1 | 1
3 | 50 | 0 | 0
4 | 51 | 1 | 1
5 | 99 | 49 | 2401
Σ | 250 | | 4804
Mean | 50
s-square | 1201
s | 34.65545
31. Example 2b Solution
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
i | X | X - X̄ | (X - X̄)²
1 | 1 | -49 | 2401
2 | 2 | -48 | 2304
3 | 50 | 0 | 0
4 | 98 | 48 | 2304
5 | 99 | 49 | 2401
Σ | 250 | | 9410
Mean | 50
s-square | 2352.5
s | 48.50258
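Examples 2a and 2b can be compared side by side. This is a minimal sketch using the standard library's `statistics.variance` and `statistics.stdev`, which both divide by n - 1 like the formula above; the helper `summarize` is a name introduced here for illustration.

```python
from statistics import mean, stdev, variance

example_2a = [1, 49, 50, 51, 99]
example_2b = [1, 2, 50, 98, 99]


def summarize(data):
    """Return (mean, range, sample variance, sample standard deviation)."""
    return mean(data), max(data) - min(data), variance(data), stdev(data)


for name, data in (("2a", example_2a), ("2b", example_2b)):
    m, r, var, sd = summarize(data)
    print(name, "mean =", m, "range =", r, "s-square =", var, "s =", round(sd, 5))
```

Both data sets have the same mean (50) and the same range (98), yet their standard deviations differ, because the standard deviation uses every data point while the range uses only the two extremes.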
32. Minitab Background
• Minitab was first introduced at Penn State in the 1970s
• Started as a DOS based program and migrated to Windows
• Heavily used in the academic world
• Frequently used in training
• Used at many Six Sigma companies (GE, AlliedSignal, Motorola)
• User friendly, especially for beginning students
33. Main Screen
Data Window:
• A Worksheet, not an Excel Spreadsheet
• Column names are above first row
• Everything in a column is considered to be from the same group
Session Window:
• The Output
35. Data Window
• Enter Data into Minitab by
– Typing it in
– Cutting & pasting from other programs
– Random number generators in Minitab
– Importing it (Excel, Text, ASCII, Dbase files, etc.)
43. Minitab Output
[Figure: Summary for Bob (Minitab graphical summary): histogram with 95% confidence intervals for the Mean and Median; callouts mark the Mean, Standard Deviation, Min Value, Max Value, and Histogram]
Anderson-Darling Normality Test
A-Squared | 0.56
P-Value | 0.134
Mean | 24.848
StDev | 0.869
Variance | 0.756
Skewness | -0.339296
Kurtosis | -0.972667
N | 30
Minimum | 23.319
1st Quartile | 24.073
Median | 25.065
3rd Quartile | 25.461
Maximum | 26.058
95% Confidence Interval for Mean | 24.524 to 25.173
95% Confidence Interval for Median | 24.349 to 25.320
95% Confidence Interval for StDev | 0.692 to 1.169
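The 95% confidence interval for the mean in this output can be approximately reproduced from the printed Mean, StDev, and N. The sketch below assumes the usual t-based interval, mean ± t · s / √n, and hard-codes the critical value t(0.975, 29) ≈ 2.045 from a standard t table. That this is how the output was computed is an assumption, and the function name is illustrative.

```python
from math import sqrt


def t_interval(mean, stdev, n, t_crit):
    """Two-sided confidence interval for the mean: mean +/- t * s / sqrt(n)."""
    half_width = t_crit * stdev / sqrt(n)
    return mean - half_width, mean + half_width


# Values read from the "Summary for Bob" output.
MEAN, STDEV, N = 24.848, 0.869, 30
T_CRIT_29 = 2.045  # t(0.975, 29 degrees of freedom), from a standard t table

low, high = t_interval(MEAN, STDEV, N, T_CRIT_29)
print(round(low, 3), round(high, 3))
```

Because the inputs are themselves rounded to three decimals, the result agrees with the printed interval (24.524 to 25.173) only to within about 0.001.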
45. Graph - Pie Chart
Check boxes
Click OK on this screen and on the previous screen and the pie chart will be created.
[Figure: Pie Chart of Pct vs Party: Dem 49.0%, Rep 41.0%, other 10.0%]
47. Medical Data Example
[Figure: Histogram of Falls with fitted Normal curve (Frequency vs. Falls): Mean 3.88, StDev 1.536, N 25]
48. Medical Data Example
[Figure: Summary for Falls (Minitab graphical summary): histogram with 95% confidence intervals for the Mean and Median]
Anderson-Darling Normality Test
A-Squared | 0.61
P-Value | 0.099
Mean | 3.8800
StDev | 1.5362
Variance | 2.3600
Skewness | -0.531741
Kurtosis | 0.118316
N | 25
Minimum | 0.0000
1st Quartile | 3.0000
Median | 4.0000
3rd Quartile | 5.0000
Maximum | 6.0000
95% Confidence Interval for Mean | 3.2459 to 4.5141
95% Confidence Interval for Median | 3.0000 to 5.0000
95% Confidence Interval for StDev | 1.1995 to 2.1371
49. Medical Data Example
Stat>Quality Tools>Run Chart
[Figure: Run Chart of Falls (Falls vs. Observation)]
Number of runs about median | 8
Expected number of runs | 12.52000
Longest run about median | 7
Approx P-Value for Clustering | 0.02214
Approx P-Value for Mixtures | 0.97786
Number of runs up or down | 17
Expected number of runs | 16.33333
Longest run up or down | 3
Approx P-Value for Trends | 0.62868
Approx P-Value for Oscillation | 0.37132
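The two "expected number of runs" values in the run chart output are consistent with the formulas commonly used for run tests: 2mn/(m + n) + 1 for runs about the median (m points above the median, n at or below it) and (2N - 1)/3 for runs up or down. The sketch below assumes those formulas. The split m = 9, n = 16 is not given in the source; it is an assumption chosen because it reproduces the printed value for N = 25.

```python
def expected_runs_about_median(m, n):
    """Expected runs about the median for m points above and n at or below it."""
    return 2 * m * n / (m + n) + 1


def expected_runs_up_down(total):
    """Expected number of runs up or down for a series of `total` observations."""
    return (2 * total - 1) / 3


N = 25                     # number of observations in the Falls data
M_ABOVE, N_BELOW = 9, 16   # assumed split around the median (not in the source)

print(round(expected_runs_about_median(M_ABOVE, N_BELOW), 5))
print(round(expected_runs_up_down(N), 5))
```

The observed 8 runs about the median is well below the expected value of about 12.5, which is in line with the small p-value reported for clustering.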
51. Medical Data Example
[Figure: U Chart of Falls (Sample Count Per Unit vs. Sample): center line Ū = 2.205, UCL = 5.354, LCL = 0]
Tests performed with unequal sample sizes
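With unequal sample sizes, U chart limits differ from sample to sample. The sketch below assumes the standard 3-sigma limits, ū ± 3·√(ū / n), with the lower limit floored at zero. The sample sizes are hypothetical, since the source only says they are unequal, and the function name is illustrative.

```python
from math import sqrt


def u_chart_limits(u_bar, n):
    """3-sigma limits for a U chart sample of size n (LCL floored at 0)."""
    half_width = 3 * sqrt(u_bar / n)
    return max(0.0, u_bar - half_width), u_bar + half_width


U_BAR = 2.205  # center line printed on the U Chart of Falls

# Hypothetical sample sizes; the actual sizes are not given in the source.
for n in (1, 2, 3):
    lcl, ucl = u_chart_limits(U_BAR, n)
    print(n, round(lcl, 3), round(ucl, 3))
```

With a sample size of 2 the upper limit comes out close to the printed UCL of 5.354 and the lower limit is floored at 0, though this does not establish what sample sizes were actually used.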
52. Medical Data Example
References
• From a Web article by Thomas Pyzdek, a Six Sigma consultant. Visit his Web site at pyzdek.com. E-mail him at tpyzdek@hotmail.com
• http://www.isixsigma.com/offsite.asp?A=Fr&Url=http://www.qualitydigest.com/may99/html/spcguide.html
53. Summary
• Where is your process centered?
• How is centering measured?
• There is variation in all things.
• Does your process have excess variation?
• Is your process stable & predictable?
• How is variation measured?
• Introduced Minitab functions for basic descriptive statistics and graphical presentation of data