An Importance Sampling Experiment in
a Six Sigma Context
Report from MS&E 408 Directed Study
Stanford University May 2014
Draft: Not to be quoted without authors’ permission
Sam Savage, Kevin Danser, Gene Wiggs
Consider a defect that results when one uncertainty (Stress) exceeds another
uncertainty (Strength). Assuming that on average Strength is greater than
Stress, defects occur only for values of Strength in the left tail and Stress in
the right tail of their respective distributions. For normal distributions the
defect rate may be found theoretically, but for many distributions the result
must be found using simulation. Because defects are rare, many simulation
trials may be needed for dependable results. Importance Sampling is a
method for generating simulation samples from the “Important” part of the
underlying distribution so as to increase the likelihood of a defect,
improving the dependability of the simulation. Without Importance
Sampling, the probability of a defect is estimated as
D/N
where D is the number of defects that occurred in the simulation, and N is
the number of simulation trials. When Importance Sampling is used, we
must correct for the fact that only part of the distribution was used. For
example, if the Importance Samples were drawn from a part of the
distribution that had probability measure P, then the estimate of defect rate is
Equation 1. (DI*P)/N
where DI is the number of defects that occurred in the Importance Sampling
simulation.
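As a concrete illustration of the plain D/N estimate, here is a minimal Python sketch (not part of the study's Excel model; the parameter values anticipate the symmetric experiment described below):

```python
import random

def plain_mc_defect_rate(n_trials=100_000, mu_stress=30.0, mu_strength=40.0,
                         sd=2.0, seed=1):
    """Estimate P(Stress > Strength) as D/N by plain Monte Carlo."""
    rng = random.Random(seed)
    defects = sum(
        rng.gauss(mu_stress, sd) > rng.gauss(mu_strength, sd)
        for _ in range(n_trials)
    )
    return defects / n_trials
```

With a true defect rate near 2E-04, only about 20 defects are expected in 100,000 trials, which is why the plain estimate is noisy.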
Our experiment performed Importance Sampling on Normal Distributions,
to provide a theoretical benchmark for comparison. Starting with a model
provided by Gene Wiggs of General Electric, an interactive simulation was
created in Microsoft Excel, based on the open SIPmath™ standard from
ProbabilityManagement.org. When any of the green cells in the model (IS
Defect.xlsx) are changed, 100,000 trials are run through the model using a
Data Table, whereupon the resulting number of defects is displayed. A
simple approach to Importance Sampling was used, in which we restricted
trials to the tails of the distributions. We referred to the increased factor of
trials in the tail as the Tail Factor. For example, tail factors of 1, 2 and 5 are
displayed below.
TF1=TF2=1: no Importance Sampling.
TF1=TF2=2: all samples drawn from the 50% extremes.
TF1=TF2=5: all samples drawn from the 20% extremes.
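One way to implement this tail restriction, sketched here in Python with the standard library's NormalDist (the study's model does this in Excel, so the function below is an illustrative stand-in, not the original implementation): a uniform draw is squeezed into the extreme 1/TF slice of [0, 1] before applying the inverse CDF.

```python
import random
from statistics import NormalDist

def tail_sample(dist, tail_factor, upper, rng):
    """Draw from the extreme 1/tail_factor slice of `dist`.
    upper=True samples the right tail (for Stress);
    upper=False samples the left tail (for Strength)."""
    u = rng.random() / tail_factor          # uniform on [0, 1/TF)
    if upper:
        u = 1.0 - u                         # map to the right tail
    # inv_cdf requires 0 < u < 1, so clamp the (vanishingly rare) endpoints
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return dist.inv_cdf(u)

rng = random.Random(0)
s = tail_sample(NormalDist(30, 2), 5, upper=True, rng=rng)
# With tail_factor=5, s always lies above the 80th percentile of N(30, 2)
```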
Symmetric Data
We started the model with symmetric data, setting the standard deviations for
stress and strength both equal to 2. The means for stress and strength were
specified as 30 and 40 respectively.
We used the theoretical yield equation,

Z = (AverageSTRENGTH – AverageSTRESS) / √(StressSD² + StrengthSD²),

to determine the theoretical Z score. This Z score gave us a theoretical
failure rate: using the NORMSDIST function in Excel, we found the probability
that a standard normal random variable is less than or equal to the
theoretical Z score, then subtracted this probability from 1. For the means
and sigma values above, the theoretical failure rate is 2.03E-04.
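The same calculation can be checked outside Excel. In the Python sketch below, NormalDist().cdf plays the role of NORMSDIST:

```python
from math import sqrt
from statistics import NormalDist

mu_stress, mu_strength = 30.0, 40.0
sd_stress, sd_strength = 2.0, 2.0

# Theoretical Z score from the yield equation
z = (mu_strength - mu_stress) / sqrt(sd_stress**2 + sd_strength**2)

# Theoretical failure rate: 1 - NORMSDIST(z)
theoretical_rate = 1.0 - NormalDist().cdf(z)
# z ≈ 3.536, theoretical_rate ≈ 2.03E-04
```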
The number of defects (DI) is the sum of failures that occurred for that
specific tail factor.
Taking the fraction of defects (DI/N) for that specific tail factor and
dividing it by the product of the two tail factors is equivalent to Equation
1 above, yielding the defect rate. Note that in this case, P in Equation 1 is
the probability that a sample drawn from the joint distribution of Stress and
Strength would simultaneously lie in the specified tails of both marginal
distributions, or (1/TF1)*(1/TF2).
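Putting the pieces together, Equation 1 with P = (1/TF1)*(1/TF2) can be sketched as follows. This is again a Python stand-in for the Excel model, using the symmetric parameters; treat it as illustrative rather than the study's implementation.

```python
import random
from statistics import NormalDist

def is_defect_rate(tf1, tf2, n=100_000, seed=1):
    """Importance-sampled estimate of P(Stress > Strength) per Equation 1.
    Stress ~ N(30, 2) sampled from its right 1/tf1 tail;
    Strength ~ N(40, 2) sampled from its left 1/tf2 tail."""
    rng = random.Random(seed)
    stress, strength = NormalDist(30, 2), NormalDist(40, 2)
    defects = 0
    for _ in range(n):
        u1 = 1.0 - rng.random() / tf1    # right tail of Stress
        u2 = rng.random() / tf2          # left tail of Strength
        u1 = min(u1, 1.0 - 1e-12)        # keep inv_cdf arguments in (0, 1)
        u2 = max(u2, 1e-12)
        if stress.inv_cdf(u1) > strength.inv_cdf(u2):
            defects += 1
    # Equation 1: (DI * P) / N with P = (1/tf1)*(1/tf2)
    return (defects / n) / (tf1 * tf2)
```

With tf1 = tf2 = 8, for example, defect trials become common enough that the corrected estimate is far more stable than the plain D/N estimate at the same N.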
With Tail Factor = 1, the number of defects out of 100,000 trials (DI) = 20,
and the simulated failure rate is 2.00E-04. You may scroll through the individual
100,000 trials with cell K5 (if you have the patience).
With Tail Factor = 2, the number of defects out of 100,000 trials (DI) = 90,
with a simulated failure rate of (90/100,000)*(1/2)*(1/2) = 2.25E-04.
Since our first experiment had equal sigma values, we set the tail factors for
stress and strength equal to each other to perform a one-dimensional
experiment. Initially we increased the tail factors in increments of one until
we reached 10, then in increments of 10 until we reached 50. For tail factors
1 through 7 the failure rates fluctuate. Between 7 and 10, the failure rate
stabilizes close to the theoretical value. For tail factors beyond 10, the
failure rate steadily falls. This is because the maximum sampled strength
values lie below many of the sampled stress values and the minimum sampled
stress values lie above many of the sampled strength values, so nearly every
trial is a defect and the estimate becomes dominated by the correction factor
(1/TF1)*(1/TF2). This is displayed in the graph below for TF1=TF2 = 50.
TF1=TF2=50
The graph below displays the experimental results. Note the region of
convergence between TF = 7 and 10.
We then zoomed in on tail factors from 1 to 10, evaluating the failure rates
in increments of 0.25 and graphing the results.
The failure rates were again fairly unstable for tail factors from 1 to 7,
but between 7 and 10 there was apparent convergence to approximately the
theoretical failure rate, before the steady decline starting at 10, as noted
above.
This suggests an algorithm for picking tail factors that iterates until a steady
downward trend is detected.
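One hypothetical form of such an algorithm (our own sketch, not something implemented in the study) scans increasing tail factors and stops when a sustained monotonic decline in the estimated rate appears:

```python
def pick_tail_factor(estimate, tfs=range(1, 51), window=3):
    """Hypothetical heuristic: scan increasing tail factors and return the
    last one before the estimated failure rate begins a sustained decline.
    `estimate(tf)` should return a simulated failure rate for TF1=TF2=tf."""
    tfs = list(tfs)
    rates = [estimate(tf) for tf in tfs]
    for i in range(window, len(rates)):
        recent = rates[i - window:i + 1]
        # `window` consecutive strictly decreasing steps = sustained decline
        if all(a > b for a, b in zip(recent, recent[1:])):
            return tfs[i - window]   # last TF before the decline began
    return tfs[-1]                   # no sustained decline detected
```

In practice the `window` parameter would need tuning, since the rates fluctuate noisily before the decline sets in.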
Asymmetric Data
Moving on from the symmetric data (equal sigma values), we wanted to
measure what happens to the failure rates when the stress and strength
standard deviations are set to different values. The standard deviation for
stress was 2.8 and the standard deviation for strength was 4.06, with means
of 30 and 51.22 respectively.
At first we again set both tail factors equal to each other to take the one-
dimensional approach to the failure rate, starting at 1 and going up to 30
in increments of 1, then up to 50 in increments of 10.
The failure rates up to around TF1=TF2 = 5 are too unstable to draw
conclusions from. From 6 to 16 they are quite stable and close to the
theoretical rate. From 16 to 50 there is a steady decrease in the failure
rate, indicating that the importance sampling has become excessive.
To further our experiment we expanded the design space beyond the 45-degree
line. We started with both tail factors equal to 5, then explored
combinations of tail factors summing to 10; for example, tail factors of
2:8, 3:7, and 4:6. This was repeated for TF1+TF2=20 and TF1+TF2=32, again
experimenting away from the diagonal, as shown below.
The resulting simulated failure rates are shown below.
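The off-diagonal design can be expressed compactly. In the hypothetical sketch below, `estimate(tf1, tf2)` stands in for a run of the Excel model with the given tail factors:

```python
def sweep_fixed_sum(estimate, total=10):
    """Evaluate all off-diagonal tail-factor pairs with TF1 + TF2 = total
    (e.g. 2:8, 3:7, 4:6 for total=10). Returns {(tf1, tf2): rate}."""
    return {(tf1, total - tf1): estimate(tf1, total - tf1)
            for tf1 in range(1, total)}
```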
Conclusions
Our experiments indicate that Importance Sampling for this problem is quite
sensitive to the tail factor. However, by searching for a monotonic decrease
in failure rate, it appears that an algorithm could optimize the tail factor to
achieve stability in the symmetric case of equal sigma values.
The off-diagonal experiments indicate a clear pattern of decreasing
simulated failure rate as the tail factor for the small-sigma distribution is
increased relative to the tail factor for the large-sigma distribution. This
suggests a fruitful area for further study.
