1. UNCLASSIFIED / FOUO
National Guard
Black Belt Training
Module 20
Data Collection
2. UNCLASSIFIED / FOUO
CPI Roadmap – Measure
8-STEP PROCESS
1. Validate the Problem
2. Identify Performance Gaps
3. Set Improvement Targets
4. Determine Root Cause
5. Develop Counter-Measures
6. See Counter-Measures Through
7. Confirm Results & Process
8. Standardize Successful Processes
Define, Measure, Analyze, Improve, Control
ACTIVITIES
• Map Current Process / Go & See
• Identify Key Input, Process, Output Metrics
• Develop Operational Definitions
• Develop Data Collection Plan
• Validate Measurement System
• Collect Baseline Data
• Identify Performance Gaps
• Estimate Financial/Operational Benefits
• Determine Process Stability/Capability
• Complete Measure Tollgate
TOOLS
• Process Mapping
• Process Cycle Efficiency/TOC
• Little’s Law
• Operational Definitions
• Data Collection Plan
• Statistical Sampling
• Measurement System Analysis
• TPM
• Generic Pull
• Setup Reduction
• Control Charts
• Histograms
• Constraint Identification
• Process Capability
Note: Activities and tools vary by project. Lists provided here are not necessarily all-inclusive.
3. UNCLASSIFIED / FOUO
Learning Objectives
Determine what to measure and why
Prepare plans to collect output, process and/or input
data
Apply sampling techniques, as needed
Construct forms and test data collection procedures
Refine data collection
Implement data collection plan
4. UNCLASSIFIED / FOUO
What Is a Measure?
A quantified evaluation of characteristics
and/or level of performance based on
observable data
Examples include:
Length of time (speed, age)
Size (length, height, weight)
Dollars (costs, sales revenue, profits)
Counts of characteristics or “attributes” (types of
customer, property size, gender)
Counts of defects (number of errors, late checkouts,
complaints)
5. UNCLASSIFIED / FOUO
Why Measure?
Establish the current performance level (baseline)
Determine priorities for action – and whether or not
to take action
Substantiate the magnitude of the problem
Gain insight into potential causes of problems and
changes in the process
Prevent problems and predict future performance
Gain knowledge about the problem,
process, customer or organization
6. UNCLASSIFIED / FOUO
Determine
What to Measure
7. UNCLASSIFIED / FOUO
What Do We Need to Know?
The first step in the creation of any data collection
plan is to decide what you need to know about your
process and where to find measurement points
What data is needed to “baseline” our problem?
What “upstream” factors might affect the
process/problem?
What do we plan to do with the data after it has been
gathered?
Do we have a balance between Output and
Input/Process measures?
8. UNCLASSIFIED / FOUO
Deciding “What and Where”
Input → Process → Output
Preparing the SIPOC diagram and a more detailed process
map can help a team select its measures
Choosing good measures requires a clear understanding of the
definitions of and relationships between Output, Process, and
Input measures
9. UNCLASSIFIED / FOUO
“X” and “Y” Variables
$$Y = f(X_1, X_2, X_3, \ldots, X_n)$$
Y is the Output; the Xs are the Input/Process variables.
Examples:
Final Score in Basketball Game = First Quarter Score + Second Quarter Score + Third Quarter Score + Fourth Quarter Score + Overtime Score
Customer Satisfaction = Front Desk Courtesy + Check In Ease + Room Comfort + Room Service + Check Out Ease
Loan Process Cycle Time = Application Data Entry Time + Credit & Collateral Check Time + Risk Assessment Time + Review & Approval Time + Loan Service Time
Generally, you can influence some of the Xs but not all. CPI
projects will generally address those Xs which can be influenced
and which have the greatest impact on Y.
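A minimal sketch of the loan cycle time example, assuming f is a simple sum of the step times. The hour values are hypothetical and only illustrate how the X with the greatest impact on Y might be spotted.

```python
# Hypothetical step times (hours) for the loan process example.
# The values are illustrative assumptions, not course data.
step_times = {
    "Application Data Entry": 2.0,
    "Credit & Collateral Check": 6.0,
    "Risk Assessment": 4.0,
    "Review & Approval": 8.0,
    "Loan Service": 1.5,
}


def cycle_time(xs):
    """Y = f(X1, ..., Xn); here f is assumed to be a plain sum."""
    return sum(xs.values())


def largest_contributor(xs):
    """Return the name of the X that contributes most to Y."""
    return max(xs, key=xs.get)


total = cycle_time(step_times)
top = largest_contributor(step_times)
print(f"Y (cycle time) = {total} hours; largest X = {top}")
```

Whether the largest X is also one the team can influence is a separate judgment the code does not make.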
10. UNCLASSIFIED / FOUO
Measuring Business Processes
X - PREDICTOR (Leading) MEASURES and Y - RESULTS (Lagging) MEASURES
Input (X): Arrival Time, Accuracy, Cost, Key Specs
Process (X): Time Per Task, In-Process Errors, Labor Hours, Exceptions
Output (Y): Customer Satisfaction, Total Defects, Cycle Time, Cost, Profit
How well do these (Xs) predict this (Y)?
11. UNCLASSIFIED / FOUO
Categories of Performance Metrics
Developing Input, Process and Output metrics around the Voice of the
Customer (VOC) and Voice of the Business (VOB) process performance
needs is a good starting point for determining what to measure
Quality: Product or Service Features, Attributes, Dimensions, Characteristics Relating to the Function of the Product or Service, Reliability, Availability, Taste, Effectiveness - Also Freedom from Defects, Rework or Scrap (Derived Primarily from the Customer - VOC)
Cost: Process Cost Efficiency, Prices to Consumer (Initial Plus Life Cycle), Repair Costs, Purchase Price, Financing Terms, Depreciation, Residual Value (Derived Primarily from the Business - VOB)
Speed: Lead Times, Delivery Times, Turnaround Times, Setup Times, Cycle Times, Delays (Derived equally from the Customer or the Business – VOC/VOB)
Service and Safety: Service Requirements, After-Purchase Reliability, Parts Availability, Service, Warranties, Maintainability, Customer-Required Maintenance, Product Liability, Product/Service Safety
Stewardship: Ethical Business Conduct, Environmental Impact, Business Risk Management, Regulatory and Legal Compliance
12. UNCLASSIFIED / FOUO
Output Measures
Referred to as “Y” data. Output Metrics quantify the
overall performance of the process, including:
How well customer needs and requirements were
met (typically Quality & Speed requirements), and
How well business needs and requirements were met
(typically Cost & Speed requirements)
Output measures provide the best overall barometer of
process performance
Focus on one Primary Output (Y) metric at a time. Use
Secondary Y metrics to “keep you honest”
Example: If the Primary Y is to improve cycle time, the Secondary Y could
monitor defects to make sure they also improve or at least don’t get worse!
13. UNCLASSIFIED / FOUO
Typical Output Measures
Process Type: Product/Manufacturing
Output (Y): Ammo. Possible Output Measures: metal chemistry/thickness/propellant weight/ballistics
Output (Y): Dining-in Ceremony. Possible Output Measures: number of missing/incorrect place cards, seating time, delivery time, accuracy (food/beverage order)
Process Type: Service/Transactional/Administrative
Output (Y): Re-enlistment Papers. Possible Output Measures: cycle time, accuracy (# of errors), completeness (# of items missing)
Output (Y): Anthony’s Pizza. Possible Output Measures: delivery timeliness, accuracy, temperature
14. UNCLASSIFIED / FOUO
X and Y Metrics
Suppliers, Inputs, Process, Outputs, Customers (SIPOC)
Inputs: Billing Dept. staff, Customer database, Shipping information, Order information
Process: Billing Process
Outputs: Delivered Invoice
Quality
Input Metrics: Accuracy of database info., Staff expertise, System up-time, System responsiveness, Accuracy of order info., Accuracy of shipping info.
Process Metrics: Rework % at each step
Output Metrics: Invoice accuracy
Speed
Input Metrics: Time to receive order info., Time to receive shipping information
Process Metrics: # of process steps, Time to complete invoice, Delay time between steps
Output Metrics: Invoice cycle time, Time to deliver invoice
Cost
Input Metrics: # of billing staff
Process Metrics: # of process steps
Output Metrics: Cost/invoice
Other Metrics: Invoices processed per month and variability
15. UNCLASSIFIED / FOUO
Develop Data
Collection Plan
16. UNCLASSIFIED / FOUO
Exercise: Data Collection
Collect Height Data
17. UNCLASSIFIED / FOUO
Types of Data
Continuous / Variable – Any variable measured on a continuum or
scale that can be infinitely divided into recognizable parts. Primary
types include time, dollars, size, weight, temperature, and speed. Any
metric that can be continuously divided by 2 and the metric still makes
sense is a continuous metric. Continuous Data is always
preferred over Discrete or Attribute Data.
Discrete / Attribute – A count, proportion, or percentage of a
characteristic or category. Service process data is often discrete.
Continuous/Variable Discrete/Attribute
• Cycle time • Late delivery
• Cost or price • Gender
• Length of call • Region/location
• Temperature of rooms • Room type
18. UNCLASSIFIED / FOUO
The Objective: Data Collection Plan
Let’s see how a Data Collection Plan is developed
Data Collection Plan
Columns of the plan, with the step in which each is developed:
Performance Measure (developed earlier)
Operational Definition (Step 2)
Data Source & Location (Step 3)
How Will Data Be Collected (Step 4)
Who Will Collect Data (Step 5)
When Will Data Be Collected and Sample Size (Step 6)
Stratification Factors (Step 1)
How will data be used? Examples:
Identification of Largest Contributors
Identifying if Data is Normally Distributed
Identifying Sigma Level and Variation
Root Cause Analysis
Correlation Analysis
How will data be displayed? Examples:
Pareto Chart
Histogram
Control Chart
Scatter Diagrams
19. UNCLASSIFIED / FOUO
Step 1. Stratification Factors
What are the ways you need to look at the data?
Data Stratification - Capturing and using characteristics
to sort data into different categories (also known as “slicing
the data”)
Used to:
Provide clues to root causes (Analyze)
Verify suspected root causes (Analyze)
Uncover times, places where problems are severe (“vital
few”)
Surface suspicious patterns to investigate
20. UNCLASSIFIED / FOUO
Stratification Factors
Factors: Examples
What: Complaints, Defects
When: Month, Day
Where: Region, City
Who: Department, Individual
If you do not collect stratification factors “up front,” you
might have to start all over later. On the other hand, seeking
too many factors makes the data more difficult and/or more
costly to collect.
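As a rough sketch of why stratification factors are captured "up front": if each record carries its what/when/where/who values at collection time, the data can be sliced later without recollecting it. The records and field names below are hypothetical.

```python
from collections import Counter

# Hypothetical complaint records; each one carries the stratification
# factors (what, when, where, who) captured when the data was collected.
records = [
    {"what": "Late checkout", "when": "Mon", "where": "North East", "who": "Front Desk"},
    {"what": "Billing error", "when": "Mon", "where": "South", "who": "Billing"},
    {"what": "Billing error", "when": "Tue", "where": "South", "who": "Billing"},
    {"what": "Late checkout", "when": "Fri", "where": "South", "who": "Front Desk"},
    {"what": "Billing error", "when": "Fri", "where": "Midwest", "who": "Billing"},
]


def stratify(rows, factor):
    """Count records at each level of one stratification factor."""
    return Counter(row[factor] for row in rows)


by_where = stratify(records, "where")
by_what = stratify(records, "what")
print(by_where.most_common())
print(by_what.most_common())
```

A factor that was never recorded cannot be passed to `stratify` later, which is the "start all over" risk the slide warns about.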
21. UNCLASSIFIED / FOUO
Stratification Matrix
Key Steps
Fill in the Output measure Y
Fill in the key stratification questions you have about the process in
relationship to the Y
List out all the levels and ways you can look at the data in order to
determine specific areas of concern
Create specific measurements for each subgroup or stratification factor
Review each of the measurements (include the Y measure) and
determine whether or not current data exists
Discuss with the team whether or not these measurements will help to
predict the output Y. If not, think of where to apply the measures so
that they will help you to predict Y
22. UNCLASSIFIED / FOUO
Stratification Matrix
Matrix layout (numbers match the Key Steps):
1. Output Y
2. Questions About Process
3. Stratification factors
4. Measurements (X Variables)
5. Does data exist to support these measurements? (Y/N)
6. Will these measurements help to predict Y? (Y/N)
23. UNCLASSIFIED / FOUO
Stratification Matrix Example - Checkout
1. Output Y: Total adjustments at checkout
2. Question About Process: Does the number of adjustments vary over time?
3. Stratification factor: By time period
4. Measurements (X Variables): # adjustments / day; # adjustments last year
2. Question About Process: Is there a difference by type of employee?
3. Stratification factor: By employee
4. Measurements (X Variables): % of adjustments / associate; # of adjustments by new vs. exp. employees
2. Question About Process: Is there a difference by type of customer?
3. Stratification factor: By type
4. Measurements (X Variables): # adjustments by room size; # adjustments by customer segment
2. Question About Process: Does the amount of adjustments vary from one location to another?
3. Stratification factor: By location
4. Measurements (X Variables): # adjustments in North East; # adjustments in South; # adjustments in Midwest
5. For each measurement: Does data exist to support these measurements? (Y/N)
6. For each measurement: Will these measurements help to predict Y? (Y/N)
24. UNCLASSIFIED / FOUO
Step 2. Developing Operational Definitions
Operational Definitions apply to MANY things we encounter every
day. For example, all the measurement systems we use (feet/inches,
weight, temperature) are based on common definitions that we all
know and accept. Sometimes these are called “standards.”
Other times, our operational definitions are more vague. For example,
when someone says a loan is “closed” they might mean papers have
been sent, but not signed; another person might mean signed but not
funded; a third person might mean funded but not recorded.
While here we are focused on operational definitions in the context of
measurement, the concept applies equally well to “operationally
defining” a customer requirement, a procedure, a regulation, or
anything else that benefits from clear, unambiguous understanding
Learning to pay attention to and clarify operational definitions can be a
major side benefit of the CPI process
25. UNCLASSIFIED / FOUO
Defining “Operational Definitions”
What it is...
A clear, precise description of the factor being
measured
Why it is critical...
So each individual “counts” things the same way
So we can plan how to measure effectively
To ensure common, consistent interpretation of results
So we can operate with a clear understanding and with
fewer surprises
26. UNCLASSIFIED / FOUO
Developing Operational Definitions
From General to Specific:
Step 1 – Translate what you want to know into something
you can count
Step 2 – Create an “air-tight” description of the item or
characteristic to be counted
Step 3 – Test your Operational Definition to make sure it
is truly “air-tight”
Note: Sometimes you will need to do some “digging” up-front
to arrive at good operational definitions. It is usually worth the
effort!!
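One way to picture an "air-tight" operational definition is as a rule that returns the same answer no matter who applies it. The sketch below assumes the +/- 10 minute delivery requirement from the pizza exercise later in this module; the function name and the sample timestamps are hypothetical. The three calls mirror Step 3 by testing the definition at and around its boundary.

```python
from datetime import datetime, timedelta

# Assumed requirement, borrowed from the pizza exercise: a delivery is on time
# when it arrives within +/- 10 minutes of the scheduled delivery time.
TOLERANCE = timedelta(minutes=10)


def is_delivery_defect(scheduled, delivered, tolerance=TOLERANCE):
    """Return True when the delivery misses the scheduled time by more than the tolerance."""
    return abs(delivered - scheduled) > tolerance


scheduled = datetime(2024, 1, 5, 18, 0)  # hypothetical order
print(is_delivery_defect(scheduled, datetime(2024, 1, 5, 18, 10)))  # exactly 10 minutes late
print(is_delivery_defect(scheduled, datetime(2024, 1, 5, 18, 11)))  # 11 minutes late
print(is_delivery_defect(scheduled, datetime(2024, 1, 5, 17, 45)))  # 15 minutes early
```

Writing the rule down this precisely forces the team to decide edge cases, such as whether exactly 10 minutes counts as a defect, before collection starts.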
27. UNCLASSIFIED / FOUO
Step 3. Data Sources
Key Question: Does the data currently exist?
Existing Data – Taking advantage of archived data or current
measures to learn about the Output, Process, or Input
This is preferred when the data is in a form we can use and
the Measurement System is valid (a big assumption and
concern)
New Data – Capturing and recording observations we have not
or do not normally capture
May involve looking at the same “stuff,” but with new
Operational Definitions
This is preferred when the data is readily and quickly
collectable (it raises fewer concerns about measurement problems)
28. UNCLASSIFIED / FOUO
Key Considerations: Existing vs. New Data
Existing vs. New Considerations
Is existing or “historical” data adequate?
Meet the Operational Definition?
Truly representative of the process, group?
Contain enough data to be analyzed?
Gathered with a capable Measurement System?
Cost of gathering new data
Time required to gather new data
The trade-offs made here (i.e., whether to take the time and effort
to gather new data or to work only with what we have) are
significant and can have a dramatic impact on project
success
29. UNCLASSIFIED / FOUO
Step 4. How will Data Be Collected?
Check Sheets
The workhorse of data collection
Enhance ease of collection
Faster capture
Consistent data from different people
Quicker to compile data
Capture essential descriptors of data
“Stratification factors”
Need to be designed for each job
30. UNCLASSIFIED / FOUO
Data Collection Forms – Check Sheets
Check sheets are convenient for gathering data
Data sheets allow:
Faster, more accurate capture
Consistent data from different people
Quicker, easier compilation
Capture essential descriptors of data
Designed for each different data gathering situation
The data may then be analyzed
31. UNCLASSIFIED / FOUO
Get Data You Can Use
As you set up Check Sheets...
Prepare a spreadsheet to compile the data
Think about how you will do the compiling (and who will do it)
Consider what sorting, graphing, or other reports you will want to create
Continuous or Discrete Data?
Adequate level of discrimination and accuracy?
Adjust check sheet as needed to ensure usable data later
But do not make data harder to collect
32. UNCLASSIFIED / FOUO
Constructing Check Sheets
1. Select specific data and factors to be included
2. Determine time period to be covered by the form
Day, Week, Shift, Quarter, etc.
3. Construct form
Be sure to include:
Clear labels
Enough room
Space for notes
4. Test the form!
33. UNCLASSIFIED / FOUO
Check Sheet Tips
Include name of collector(s) (first and last)
Reason/comment columns should be clear and concise
Use full dates (month, day, year)
Use explanatory title
Consider lowest common denominator on metric
Minutes vs. Hours
Inches vs. Feet
Test and validate your design (try it out)
Do not change form once you have started, or you will be
“starting over!”
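If the check sheet is later compiled in software, a fixed set of categories helps keep the form from drifting once collection has started. This is only a sketch: the `CheckSheet` class is hypothetical, the category names are taken from the repair complaint example in this module, and the year in the date is an assumption.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CheckSheet:
    """A tally-style check sheet with categories fixed at design time."""
    title: str           # explanatory title
    collector: str       # first and last name of the collector
    period_start: date   # full date for the period the form covers
    categories: tuple    # agreed before collection starts
    tallies: Counter = field(default_factory=Counter)

    def mark(self, category):
        """Add one tally mark; reject categories that are not on the form."""
        if category not in self.categories:
            raise ValueError(f"'{category}' is not on this check sheet")
        self.tallies[category] += 1


sheet = CheckSheet(
    title="Repair Complaints by Type",
    collector="Kevin Regan",
    period_start=date(2023, 6, 26),  # year is a hypothetical placeholder
    categories=("TV", "Smk Det", "Thrmstat", "RemCon", "Shower", "Window"),
)
for complaint in ["TV", "Shower", "TV"]:
    sheet.mark(complaint)
print(sheet.tallies.most_common())
```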
34. UNCLASSIFIED / FOUO
Types of Check Sheet: Frequency Plot
Shows “distribution” of items or occurrences along a scale or ordered quantity
Helps detect unusual patterns in a population – or detect multiple populations
Gives visual picture of “average” and “range”
[Figure: frequency plot check sheet, “Frequency of Repairs, July,” with one X tallied per repair against each day of the month (1 to 31)]
35. UNCLASSIFIED / FOUO
Types of Check Sheets: Standard
Week of: 6/26 Collected by: Kevin Regan
Columns: Call Date, Call Time, Initials, Repair Complaint (TV, Smk Det, Thrmstat, RemCon, Shower, Window; marked with an X), Repair Time, Notes
30-Jun 8:00a EJS X X 10 min
28-Jun 8:15a MWT X 1 hr
27-Jun 7:00p MWT X 15 min
26-Jun 6:30p KLC X 2 hrs
28-Jun 5:45p PP X 30 min
30-Jun 6:00a KR X 40 min
1-Jul 8:15p DRT X 4 hrs Replaced part
1-Jul 8:20p ECS X 2 hrs Not in stock
28-Jun 9:35a MWT X 1 hr
29-Jun 9:40a KLC X 30 min
29-Jun 5:15p EJS X 45 min
29-Jun 5:20p KR X 15 min
36. UNCLASSIFIED / FOUO
Types of Check Sheets – Traveler
Traveler Checksheet
Awards Approval Process
Awardee: __________________________________________________
Award type: □ PCS □ Other ___________________________
Proposed award date: ________________________________________
Recommender’s division:
□ G-1 □ G-2 □ G-3 □ G-4 □ Other __________
Columns: Process step / Time begun; Time completed / Defects found
Process steps (one row each):
Fill out forms
Approve recommendation
Schedule presentation
37. UNCLASSIFIED / FOUO
Types of Check Sheets – Confirmation
Example: Power Steering project tracking
39. UNCLASSIFIED / FOUO
Check Sheet Takeaways
A check sheet is an easy way to collect data in order
to observe trends and identify improvement priorities
Mistake-proof data collection by using check boxes,
tallies, or choices that can be circled (reduce any
writing to an absolute minimum – or none at all!)
Remember to include those who understand the
process and those who will actually use the check
sheet in the design of the check sheet. This is very
important for success!
40. UNCLASSIFIED / FOUO
Step 5. Who Will Collect the Data?
Considerations:
Familiarity with the process
Availability/impact on job
Rule of Thumb – If it takes someone more than 15
minutes per day it is not likely to be done
Potential Bias
Will finding “defects” be considered risky or a
“negative?”
Benefits of Data Collection
Will data collection benefit the collector?
41. UNCLASSIFIED / FOUO
Preparing Collectors
Be sure collectors:
Give input on the check sheet design
Understand operational definitions (!)
Understand how data will be tabulated
Helps them see the consequences of changing
Have been trained and allowed to practice
Have knowledge and are unbiased
42. UNCLASSIFIED / FOUO
Step 6. Sampling
Sampling is using a smaller group to represent the
whole population (the foundation of “statistics”)
Benefits:
Saves time and money
Allows for more meaningful data
Simplifies measurement over time
Can improve accuracy
43. UNCLASSIFIED / FOUO
Sampling Considerations
Time
Cost
Accuracy
[Figure labels: Units Processed Per Day; Cost to Collect Data]
44. UNCLASSIFIED / FOUO
Sampling Types
Population – Drawing from a fixed group with
definable boundaries. No time element.
Customers
Complaints
Items in Warehouse
Process – Sampling from a changing flow of items
moving through the business. Has a time
element.
New customers per week
Hourly complaint volume
Items received or shipped by day
45. UNCLASSIFIED / FOUO
Population or Process Sampling
Of primary importance in a Lean Six Sigma measurement
effort is to clarify if you are engaged in Population or
Process sampling
Most traditional statistical training focuses on sampling
from populations – a group of items or events from which
a representative sample can be drawn. A population
sample looks at the characteristics of the group at a
particular point in time.
Quality and business process improvement tends to focus
more often on processes, where change is a constant
46. UNCLASSIFIED / FOUO
Population or Process Sampling
In process sampling, you measure characteristics of items
as they pass through the process, and
observe changes over time
Any data you collect that has “time order” included can be
examined as either a population or a process – however,
the size of the sample analyzed might need to be different
Given a choice, process data gives more information, such
as trends and shifts of short duration. Process sampling
techniques are the foundation of process monitoring and
control.
48. UNCLASSIFIED / FOUO
Sampling Methods/Strategies
The big pitfall in sampling is “bias” – i.e., selecting a sample that does
NOT really represent the whole. The sampling plan needs to guard
against bias. Different methods of sampling have different advantages
and disadvantages in managing bias.
Judgment
As it sounds – selecting a sample based on someone’s knowledge of
the process, assuming that it will be “representative.” Judgment
guarantees a bias, and should be avoided.
Convenience
Also just like it sounds – sampling those items or at those times
when it is easier to gather the data. (For example, taking data
from people you know, or when you go for coffee.) This is another
common (but ill-advised) approach.
49. UNCLASSIFIED / FOUO
Sampling Strategies
Best Methods:
Random
Best approach for population situations. Use a random
number table or random function in Excel or other software,
or draw numbers from a hat.
Systematic
Most practical and unbiased in a process situation.
“Systematic” means that we select every nth unit, or take
samples at specific times of the day. The risk of bias comes
when the timing of the sample matches a pattern in the
process.
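The two recommended methods can be sketched with Python's standard `random` module. The function names and the example lists are illustrative assumptions; the seed is only there to make a drawn sample reproducible for review.

```python
import random


def random_sample(population, n, seed=None):
    """Population sampling: draw n distinct items, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(units, every_nth, start=0):
    """Process sampling: take every nth unit in time order, from position `start`."""
    return units[start::every_nth]


# Hypothetical population of 200 numbered records.
population = list(range(1, 201))
picked = random_sample(population, 20, seed=1)

# Hypothetical stream of 20 units in the order they were processed.
stream = list(range(1, 21))
every_fifth = systematic_sample(stream, 5)
print(sorted(picked))
print(every_fifth)
```

For the systematic case, choosing `every_nth` so that it does not line up with a known cycle in the process (for example, shift changes) is the guard against the bias described above.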
50. UNCLASSIFIED / FOUO
Sampling Strategies Considerations
Should we stratify first? ...
Focus on one group within the process or population?
Ensure adequate representation from various segments
of the population or process?
Does it “feel right?”
Sampling needs to fit common sense considerations
Confront and manage your biases in advance
51. UNCLASSIFIED / FOUO
Key Sampling Terms/Concepts
Sampling Event – The act of extracting items from
the population or process to measure
Subgroup – The number of consecutive units
extracted for measurement at each Sampling Event
(A “subgroup” can be just one!)
Sampling Frequency – Applies only to process
sampling; the number of times per day or week a
sample is taken (i.e., sampling events per period of
time)
These are the key elements to be included in the sampling plan: what we will
“extract,” how many we will take at a time, and how often we will take a sample.
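The three elements of a process sampling plan can be captured in a small record, which also makes it easy to see how long collection will take. The class, its field names, and the numbers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessSamplingPlan:
    item: str             # what we will extract
    subgroup_size: int    # consecutive units taken at each sampling event
    events_per_day: int   # sampling frequency

    def units_per_day(self):
        """Units measured per day = subgroup size x sampling events per day."""
        return self.subgroup_size * self.events_per_day

    def days_to_reach(self, min_sample_size):
        """Whole days of collection needed to reach a minimum sample size."""
        per_day = self.units_per_day()
        return -(-min_sample_size // per_day)  # ceiling division


plan = ProcessSamplingPlan(item="invoice", subgroup_size=5, events_per_day=4)
print(plan.units_per_day(), plan.days_to_reach(62))
```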
52. UNCLASSIFIED / FOUO
Population Sampling Steps
Building the “Sampling Plan”
1. Develop an initial profile of the data
2. Select a sampling strategy
3. Determine the initial sample size
4. Adjust as needed to determine minimum
sample size
53. UNCLASSIFIED / FOUO
Sampling – Initial Data Profile
Population size (Noted as “N”)
As you begin preparing the Sampling Plan, you first
need to determine the rough size of the total population
Stratification factors
If you elect to conduct a stratified sample, you
need to know the size of each subset or stratum
What precision result do you need?
Next, you need to define the level of precision needed in your
measurement. Precision describes how tightly your measurement will
describe the result. For example, if measuring cycle time, your sample
size will be affected by whether you want precision in days (e.g., estimate is
within +/- 2 days) or hours (e.g., estimate is within +/- 4 hours). Precision
is denoted by the variable “d” or D. The sample size goes up very rapidly
as the precision is tightened.
54. UNCLASSIFIED / FOUO
Sampling – Initial Data Profile
The last step in your initial profile is to estimate the
variation in the population
Continuous data requires an estimate of
the “standard deviation” of the variable
being measured
Continuous data: How much does the
characteristic vary? (estimated standard
deviation)
Discrete data requires an estimate of “P,” the
proportion of the population that contains the
characteristic in question
Discrete data: What proportion contains the characteristic?
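When no prior estimate exists, a small pilot sample is one common way to get a first guess at the standard deviation or at P. The sketch below uses Python's standard `statistics` module; the pilot values are hypothetical.

```python
import statistics

# Hypothetical pilot data: cycle times in days (continuous) and a
# defect flag per unit (discrete). Values are illustrative only.
pilot_times = [12.0, 15.5, 9.0, 14.0, 11.5, 18.0, 10.0, 13.0]
pilot_defects = [True, False, False, True, False, False, False, False]

# Continuous data: sample standard deviation as the estimate of "s".
s_estimate = statistics.stdev(pilot_times)

# Discrete data: share of units with the characteristic as the estimate of P.
p_estimate = sum(pilot_defects) / len(pilot_defects)

print(f"estimated s = {s_estimate:.2f} days, estimated P = {p_estimate:.2f}")
```

These are rough starting values for the sample size calculation, not final results; they can be refined once real data starts to come in.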
55. UNCLASSIFIED / FOUO
Sampling – Sampling Strategy
Random or systematic?
How will we draw the sample?
Who will conduct the “sampling event?”
How will we guard against bias?
Most representative vs. time, effort, and cost
No differences between what you collect and what you
do not collect
56. UNCLASSIFIED / FOUO
Sampling
Some Final Tips ...
When you want to ensure representation from different
groups or strata, prepare a separate sampling plan for
each group
Be sure to maintain the time order of your
samples/subgroups so you can see changes over time
Common sense is a useful tool in sampling
Help is available if you need it!
57. UNCLASSIFIED / FOUO
Test, Refine and Implement
Ensuring “Quality” Measurement
Measurement is rarely perfect – especially at first
Even good measurement can go “bad”
As you use data, lessons might include ...
How to simplify measures
Other stratification factors needed
Ways to improve collection forms
Other measures to investigate
59. UNCLASSIFIED / FOUO
Operational Definitions Template
Define each of the Key Input, Output, Process Metrics from your SIPOC that you are going to
collect data on (via the Data Collection Plan) as well as any other terms that need clarification
for the data collectors and everyone else on the team.
Examples:
Award Process PLT: The time from when a Director submits the Award recommendation to
the time when the employee is presented the Award in a ceremony.
Number of Claims Processed: The number of Claims processed per weekday (M-F).
Total Hours Worked: The total number of hours worked in the facility including weekends
and holidays.
Number of Personnel: The total number of military and civilian personnel working (not
including contractors).
Include other unique terms that apply to your project that require clear operational definitions
for those collecting the data and for those interpreting the data.
Required
60. UNCLASSIFIED / FOUO
Data Collection Plan Template
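- Example -
Columns: Performance Measure; Operational Definition; Data Source and Location; How Will Data Be Collected; Who Will Collect Data; When Will Data Be Collected; Sample Size; Stratification Factors; How will data be used?
1
Performance Measure: Ability to update projects and build tollgate reviews
Operational Definition: X – Steps to update projects
Data Source and Location: In DEPMS
How Will Data Be Collected: By counting steps
Who Will Collect Data: Name
When Will Data Be Collected: ASAP
Sample Size: 1
Stratification Factors: None
How will data be used? To find VA, BNVA, NVA
2
Performance Measure: Ability to update projects and build tollgate reviews
Operational Definition: X – Tollgate template slides that match POI
Data Source and Location: In DEPMS
How Will Data Be Collected: By determining % of activity steps identified in “Introduction to _____” modules in POI that are adequately addressed in templates
Who Will Collect Data: Name
When Will Data Be Collected: ASAP
Sample Size: 40
Stratification Factors: None
How will data be used? To determine consistency with POI
3
Performance Measure: Easy Access to LSS tools and references
Operational Definition: X – Availability of LSS tools and references
Data Source and Location: In DEPMS
How Will Data Be Collected: By determining the percentage of tools, with their references, listed on DMAIC Road Map slides that can be found in PS
Who Will Collect Data: Name
When Will Data Be Collected: ASAP
Sample Size: 63
Stratification Factors: None
How will data be used? To determine availability of tools and references
4
Performance Measure: Easy Access to LSS tools and references
Operational Definition: X – Steps required to find tools and references
Data Source and Location: In DEPMS
How Will Data Be Collected: By counting # steps required to find the tools and their references
Who Will Collect Data: Name
When Will Data Be Collected: ASAP
Sample Size: 37
Stratification Factors: None
How will data be used? To find VA, BNVA, NVA
Required Deliverable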
61. UNCLASSIFIED / FOUO
Exercise: Data Collection
Objective
Create a data collection plan for the GGA's Budget
Department
Instructions
Include:
1. Key input, process and output metrics
2. Operational definitions
3. Data collection methods
Time = 30 Minutes
62. UNCLASSIFIED / FOUO
Takeaways
Know what to measure and why
Create a plan to collect output, process and/or input
data
Construct forms and test data collection procedures
using appropriate data sampling methods
Refine data collection
Collect the data
Analyze the data
63. UNCLASSIFIED / FOUO
What other comments or questions
do you have?
64. UNCLASSIFIED / FOUO
Appendix
Sample Size Calculations
65. UNCLASSIFIED / FOUO
How Many Do We Need to Count?
Factors in Sample Size Selection:
Situation: Population or Process
Data Type: Continuous or Discrete
Objectives: What you will do with results
Familiarity: What you guess results will be
Certainty: How much “confidence” you need in your
conclusions
66. UNCLASSIFIED / FOUO
Three Factors Drive Sample Sizes
Three concepts affect the conclusions drawn from a
single sample data set of (n) items:
Variation in the underlying population (sigma)
Risk of drawing the wrong conclusions
How small a Difference is significant (delta)
Risk
Variation Difference
67. UNCLASSIFIED / FOUO
Three Factors: Variation, Risk, Difference
These 3 factors work together. Each affects the others.
Variation: When there’s greater variation, a larger
sample is needed to have the same level of
confidence that the test will be valid. More variation
diminishes our confidence level.
Risk: If we want to be more confident that we are not
going to make a decision error or miss a significant
event, we must increase the sample size.
Difference: If we want to be confident that we can
identify a smaller difference between two test
samples, the sample size must increase.
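A quick sensitivity check can make the interaction of the three factors concrete. The sketch uses the continuous-data sample size formula given later in this appendix; the baseline values (s = 4, D = 1) and the 99% constant 2.576 are assumptions chosen for illustration.

```python
import math


def n_required(s, d, z=1.96):
    """Minimum sample size for continuous data: (z * s / d)^2, rounded up."""
    return math.ceil((z * s / d) ** 2)


baseline = n_required(s=4, d=1)                  # 95% confidence
more_variation = n_required(s=8, d=1)            # variation doubled
smaller_difference = n_required(s=4, d=0.5)      # precision tightened by half
more_confidence = n_required(s=4, d=1, z=2.576)  # 99% confidence instead of 95%

for label, n in [
    ("baseline", baseline),
    ("double the variation", more_variation),
    ("half the difference", smaller_difference),
    ("99% confidence", more_confidence),
]:
    print(f"{label}: n = {n}")
```

Doubling the variation or halving the detectable difference each roughly quadruples the required sample, while raising the confidence level has a smaller effect in this example.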
68. UNCLASSIFIED / FOUO
Determining Minimum Sample Size
Minimum sample size from a population or a stable process can be
estimated from the following formulas:
Continuous Data Sample Size
For continuous data:
$$n = \left(\frac{1.96\,s}{D}\right)^2$$
Where: n = minimum sample size required
s = estimate of standard deviation of the
population or process data
D = level of precision desired from the sample
in the same units as the “s” measurement
1.96 = constant representing a 95%
confidence interval
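The continuous-data formula translates directly into a few lines of Python. The function name is hypothetical, and rounding up to the next whole unit is an assumption consistent with the worked examples in this appendix.

```python
import math

Z_95 = 1.96  # constant for a 95% confidence interval


def min_sample_size_continuous(s, d, z=Z_95):
    """n = (z * s / d)^2, rounded up to the next whole unit.

    s: estimated standard deviation; d: desired precision, in the same units as s.
    """
    if s <= 0 or d <= 0:
        raise ValueError("s and d must be positive")
    return math.ceil((z * s / d) ** 2)


# Contracting example from this appendix: s = 4 days, D = 1 day.
n_contracts = min_sample_size_continuous(s=4, d=1)
print(n_contracts)
```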
69. UNCLASSIFIED / FOUO
Determining Minimum Sample Size
Discrete Data Sample Size
For discrete or proportion data:
$$n = \left(\frac{1.96}{D}\right)^2 P(1 - P)$$
Where
n = minimum sample size
P = estimate of the proportion of the population or process
which is defective
D = level of precision desired from the sample in units of
proportion
1.96 = constant representing a 95% confidence interval
The highest value of P(1 - P) is 0.25, which occurs at P = 0.5
Benefits of Continuous Data
Usually requires a smaller sample
More information for stratification and root cause analysis
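The proportion formula on this slide can be sketched the same way. The function name is hypothetical; the loop simply checks numerically that P(1 - P) peaks at P = 0.5, which is why 0.5 is the conservative choice when P is unknown.

```python
import math


def min_sample_size_proportion(p, d, z=1.96):
    """n = (z / d)^2 * p * (1 - p), rounded up to the next whole unit.

    p: estimated proportion; d: desired precision in units of proportion.
    """
    if not 0 < p < 1 or d <= 0:
        raise ValueError("p must be between 0 and 1, and d must be positive")
    return math.ceil((z / d) ** 2 * p * (1 - p))


# P(1 - P) over a grid of P values from 0.01 to 0.99.
grid = [i / 100 for i in range(1, 100)]
worst_case_p = max(grid, key=lambda p: p * (1 - p))

n_rework = min_sample_size_proportion(p=0.25, d=0.05)       # DPW rework example
n_conservative = min_sample_size_proportion(p=0.5, d=0.05)  # P unknown
print(worst_case_p, n_rework, n_conservative)
```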
70. UNCLASSIFIED / FOUO
Formula for Small Populations
Making adjustments in the minimum sample size
required/needed for small populations:
Both sample size formulas assume:
a 95% confidence interval
a small sample size (n) compared to the entire population size (N)
If n/N is greater than 0.05, the sample size should be
adjusted to:
$$n_{\text{finite}} = \frac{n}{1 + \frac{n}{N}}$$
The proportion formula should only be used when: $nP \ge 5$
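The small-population adjustment is a one-line calculation once n and N are known. In this sketch the function name is hypothetical and the 0.05 threshold is the one stated on this slide.

```python
import math


def adjust_for_small_population(n, population_size, threshold=0.05):
    """Apply n / (1 + n / N) when n / N exceeds the threshold; otherwise keep n."""
    if n / population_size <= threshold:
        return n
    return math.ceil(n / (1 + n / population_size))


# 289 needed, only 200 units in the population: adjustment applies.
n_small = adjust_for_small_population(289, 200)
# 62 needed out of 5,000 units: n / N = 0.0124, so no adjustment.
n_large = adjust_for_small_population(62, 5000)
print(n_small, n_large)
```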
71. UNCLASSIFIED / FOUO
Formula for Small Populations
Example: Processing CAC applications
Given:
The sample size formula shows that you need a minimum
sample size of 289
You have only processed 200 units
Solution: The correct minimum sample size would be:
$$n_{\text{finite}} = \frac{n}{1 + \frac{n}{N}} = \frac{289}{1 + \frac{289}{200}} = 118.2$$
Rounded up, the minimum sample size required is 119.
72. UNCLASSIFIED / FOUO
Minimum Sample Size – Continuous Example
Example: Sample Size Calculation – Continuous
A Lean Six Sigma team samples a contracting process to determine
the average processing time and wishes to estimate the average time
within one day. Based on previous sampling, the team has estimated
the standard deviation of the current contract process time as 4 days.
What is the minimum sample size required to be able to estimate the
average with the required precision?
$$n = \left(\frac{1.96\,s}{D}\right)^2 = \left(\frac{1.96 \times 4}{1}\right)^2 = 61.5$$
Rounded up, n = 62 contracts
73. UNCLASSIFIED / FOUO
Minimum Sample Size – Discrete Example
Example: Sample Size Calculation – Discrete
Another Lean Six Sigma team determines the minimum sample size
required for the proportion of DPW, Department of Public Works,
service contracts that require rework at the approval meeting. From
interviews, the team has concluded that approximately 25% of the
contracts contain errors and require rework. They wish to determine
the % requiring rework within 5%.
$$n = \left(\frac{1.96}{0.05}\right)^2 (0.25)(1 - 0.25) = (1536.64)(0.1875) = 288.1$$
Rounded up, n = 289 contracts
74. UNCLASSIFIED / FOUO
Exercise:
Sample Size
Objective:
Determine the appropriate sample size
Instructions:
Use the pizza delivery example. The pizza is scheduled for
the time the customer requests delivery.
The customer requirement is +/- 10 minutes from the
scheduled delivery time
Estimated s = 7.16 minutes and D = 2 minutes
Estimated number of defects is 30% ( P = 0.30; D =5%)
Determine the minimum sample size for both continuous
and discrete data
76. UNCLASSIFIED / FOUO
Exercise:
Sample Size
Objective:
Determine the appropriate sample size
Instructions:
Select one output indicator for your process
Determine the type of data (continuous / discrete)
Continuous - estimate “s” and D
Discrete - estimate D and P
Determine the minimum sample size required
77. UNCLASSIFIED / FOUO
Exercise:
Sample Size Formula
Objective:
Determine the appropriate sample size formula to use
Instructions:
At your tables determine the right formula (proportion/discrete or continuous)
to use and calculate the sample size for each situation
1. Estimate the average cycle time within 2 hours. The estimated standard
deviation is 8 hours. What is the minimum number to sample?
2. A team collected 100 observations to determine the proportion defective.
They found 20% to be defective. How accurately can they estimate the
proportion defective?
3. You have a customer survey with 2 categorical questions and 8 interval
statements. You estimate that at least one option of a categorical
question will be answered by approximately 50% of the respondents and
you want to be able to detect a difference within ± 5%. For the
continuous statements you want to be able to detect a difference of at
least ½ a point. The highest estimated standard deviation for any of the
statements is 1.2. You expect the response rate to be 25%. How many
surveys do you have to send out and how many completed surveys do
you need returned?
78. UNCLASSIFIED / FOUO
Answers to Sampling Exercise
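1. Continuous:
$$n = \left(\frac{1.96\,s}{D}\right)^2 = \left(\frac{1.96 \times 8}{2}\right)^2 = 61.5, \text{ rounded up to } 62$$
2. Discrete/Proportion: solve the formula for D with n = 100 and P = 0.2
$$n = \left(\frac{1.96}{D}\right)^2 P(1 - P)$$
$$100 = \left(\frac{1.96}{D}\right)^2 (0.2)(1 - 0.2)$$
D = .08 or 8%
3. Discrete calculation:
$$n = \left(\frac{1.96}{0.05}\right)^2 (0.5)(1 - 0.5) = 384.2, \text{ rounded up to } 385$$
Continuous calculation:
$$n = \left(\frac{1.96 \times 1.2}{0.5}\right)^2 = 22.1, \text{ rounded up to } 23$$
The larger (discrete) requirement governs, so 385 completed surveys are needed. With a 25% response rate, you must send out 4 × the minimum sample, or 4 × 385 = 1,540 surveys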
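Answers 2 and 3 rearrange the proportion formula and scale by the response rate. A short sketch, with hypothetical function names, reproduces both steps.

```python
import math


def achievable_precision(n, p, z=1.96):
    """Solve n = (z / D)^2 * p * (1 - p) for D, given a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


def surveys_to_send(completed_needed, response_rate):
    """Surveys to mail so the expected number of returns meets the minimum."""
    return math.ceil(completed_needed / response_rate)


# Answer 2: 100 observations, 20% defective.
d_exercise = achievable_precision(n=100, p=0.2)

# Answer 3: 385 completed surveys needed, 25% expected response rate.
to_send = surveys_to_send(completed_needed=385, response_rate=0.25)
print(f"D = {d_exercise:.4f}, surveys to send = {to_send}")
```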