Using Microsoft Excel for Six Sigma

How to use Microsoft Excel more effectively for your Six Sigma project.

Published in: Education

Transcript of "Using Microsoft Excel for Six Sigma"

  1. Setting expectations; Calculating measures of central tendency and variation; Skewness and kurtosis; Calculating area under the normal curve; Sorting data; Histogram; Pareto chart; Scatter diagrams; Bar and pie charts; Using the Analysis ToolPak for advanced functions
  2. This is not a training on Six Sigma! The presentation assumes that you are already aware of Six Sigma concepts and are looking for ways to apply them using MS Excel. It also assumes that you know the basics of MS Excel, so it focuses on some advanced analytical concepts. The Excel tips and tools mentioned here can be used in multiple phases of DMAIC, so the presentation does not follow the DMAIC flow. The training is based on MS Excel 2007; improvise a little if you are using MS Excel 2003.
  3. In mathematics, the central tendency of a data set is a measure of the "middle" or "expected" value of the data set. Many different descriptive statistics can serve as a measure of central tendency, including the mean, the median, and the mode. Other statistical measures, such as the standard deviation and the range, are called measures of spread and describe how spread out the data are.
  4. The arithmetic mean (average) of a list of numbers is the sum of all the numbers in the list divided by the number of items in the list. To obtain the arithmetic mean of a data set, use the Excel function AVERAGE. Syntax: =AVERAGE(number1,number2,...)
  5. The median is the number separating the higher half of a sample, a population, or a probability distribution from the lower half. If there is an even number of observations, the median is not unique, so one often takes the mean of the two middle values. Syntax: =MEDIAN(number1,number2,...)
  6. The mode is the value that occurs most frequently in a data set or a probability distribution. The mode is not necessarily unique, since the same maximum frequency may be attained at different values. Syntax: =MODE(number1,number2,...)
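As a cross-check outside Excel, the three measures of central tendency can be sketched with Python's standard library. The sample reuses the ten raw values from the ordered-array slide of this deck; the variable names are illustrative and not part of the presentation.

```python
from statistics import mean, median, multimode

# Illustrative sample: the ten raw values from the ordered-array slide.
data = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]

avg = mean(data)         # counterpart of =AVERAGE(...)
mid = median(data)       # counterpart of =MEDIAN(...); averages the two middle values for an even count
modes = multimode(data)  # every value tied for the highest frequency

print(avg, mid, modes)
```

Two values tie for the highest frequency in this sample, which illustrates why the mode is not necessarily unique; `multimode` lists every tied value rather than a single one.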
  7. In statistics, variance is the expected squared deviation of a variable or distribution from its expected value or mean. To obtain the variance of a data set, Excel provides the function VAR, which treats the data as a sample. Syntax: =VAR(number1,number2,...)
  8. Standard deviation is a measure of the variability or dispersion of a statistical population, a data set, or a probability distribution. To calculate the standard deviation of a sample in an Excel worksheet, use the function STDEV. Syntax: =STDEV(number1,number2,...)
  9. In descriptive statistics, the range is the length of the smallest interval that contains all the data. It is calculated in Excel by subtracting the minimum from the maximum value of the sample. Syntax: =MAX(A2:A16)-MIN(A2:A16)
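The three measures of spread can be sketched the same way. This assumes the sample (n − 1) definitions, which is what VAR and STDEV compute, and again borrows the ten values from the ordered-array slide as illustrative data.

```python
from statistics import variance, stdev

# Illustrative sample: the ten raw values from the ordered-array slide.
data = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]

sample_var = variance(data)         # sample variance (n - 1 denominator), like =VAR(...)
sample_sd = stdev(data)             # sample standard deviation, like =STDEV(...)
data_range = max(data) - min(data)  # like =MAX(...)-MIN(...)

print(round(sample_var, 3), round(sample_sd, 3), data_range)
```

The standard deviation is the square root of the variance, so it is expressed in the same units as the data, which usually makes it the easier of the two to interpret.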
  10. In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable. It is measured in Six Sigma because, in reality, data are rarely perfectly symmetric. Syntax: =SKEW(A2:A16)
  11. In probability theory and statistics, kurtosis is a measure of the "peakedness" of the probability distribution of a real-valued random variable. Syntax: =KURT(A2:A16)
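For readers who want to see what lies behind these two functions, here is a minimal sketch. It assumes the sample-adjusted formulas that Excel's documentation gives for SKEW and KURT; the helper names `excel_skew` and `excel_kurt` are hypothetical.

```python
from statistics import mean, stdev

def excel_skew(values):
    """Sample skewness, assuming the formula documented for Excel's SKEW."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

def excel_kurt(values):
    """Sample excess kurtosis, assuming the formula documented for Excel's KURT."""
    n = len(values)
    m, s = mean(values), stdev(values)
    fourth = sum(((x - m) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

# Illustrative sample: the ten raw values from the ordered-array slide.
data = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
print(excel_skew(data), excel_kurt(data))
```

A positive skewness indicates a longer tail toward high values, and a perfectly symmetric sample gives zero. SKEW needs at least three data points and KURT at least four, because of the denominators.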
  12. If the mean is 85 days and the standard deviation is 5 days, what is the yield if the USL is 90 days? Z = (90 − 85) / 5 = 1. Y = Pr(x ≤ 90) = Pr(z ≤ 1). The area under the curve to the right of the USL would be considered % defective. P(z < 1) = P(z > −1) = 1 − 0.15865 = 0.8413, so Yield ≅ 84.1%. [Figure: normal curve with the USL and yield marked; axes: Days and Z-scale]
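The yield calculation above can be reproduced with a short sketch using Python's `statistics.NormalDist`, whose `cdf` plays the role of NORMDIST with the cumulative flag set to TRUE. The variable names are illustrative.

```python
from statistics import NormalDist

mean_days, sd_days, usl = 85, 5, 90

z = (usl - mean_days) / sd_days                           # Z-score of the upper specification limit
yield_fraction = NormalDist(mean_days, sd_days).cdf(usl)  # like =NORMDIST(90,85,5,TRUE)
defective_fraction = 1 - yield_fraction                   # area to the right of the USL

print(z, round(yield_fraction, 4), round(defective_fraction, 4))
```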
  13. Syntax: =NORMDIST(x,mean,standard_deviation,cumulative)
  17. For a pizza delivery center, the mean delivery time is 20 minutes and the standard deviation is 3.5 minutes. What is their target if the probability of achieving the target is 99.78%? [Figure: normal curve with the USL and yield marked]
  18. Syntax: =NORMINV(probability,mean,standard_deviation)
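The pizza delivery question is an inverse normal lookup. A minimal sketch, assuming delivery times are normally distributed as the slide implies, uses `NormalDist.inv_cdf` as the counterpart of NORMINV; the names are illustrative.

```python
from statistics import NormalDist

delivery = NormalDist(mu=20, sigma=3.5)  # delivery time in minutes

target = delivery.inv_cdf(0.9978)        # like =NORMINV(0.9978,20,3.5)
check = delivery.cdf(target)             # round trip back to the probability

print(round(target, 2), round(check, 4))
```

The target comes out close to 30 minutes, a little under three standard deviations above the mean.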
  21. Data in raw form are usually not easy to use for decision making. Some type of organization is needed: a table or a graph. Techniques reviewed here: ordered array, histograms, bar charts and pie charts, contingency tables.
  22. A sorted list of data (an ordered array) shows the range (min to max), provides some signals about variability within the range, and may help identify outliers (unusual observations). If the data set is large, the ordered array is less useful.
  23. Data in raw form (as collected): 24, 26, 24, 21, 27, 27, 30, 41, 32, 38. Data in ordered array from smallest to largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41.
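The same ordering can be sketched in code, which also makes the minimum and maximum easy to read off. The variable names are illustrative.

```python
# Raw data exactly as listed on the slide.
raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]

ordered = sorted(raw)                   # ordered array, smallest to largest
low, high = ordered[0], ordered[-1]     # ends of the range

print(ordered, low, high)
```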
  24. A graph of the data in a frequency distribution is called a histogram. The class boundaries (or class midpoints) are shown on the horizontal axis; the vertical axis is either frequency, relative frequency, or percentage. Bars of the appropriate heights represent the number of observations within each class.
  25. Frequency distribution:
      Class                  Midpoint   Frequency
      10 but less than 20    15         3
      20 but less than 30    25         6
      30 but less than 40    35         5
      40 but less than 50    45         4
      50 but less than 60    55         2
      [Figure: histogram "Daily High Temperature"; horizontal axis: class midpoints; vertical axis: Frequency; no gaps between bars]
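How observations are assigned to "lower but less than upper" classes can be sketched as follows. The raw temperature readings behind the table are not listed in this transcript, so the example bins the ten values from the ordered-array slide instead; the class width of 10 matches the table, and the variable names are illustrative.

```python
from collections import Counter

# Illustrative data: the ten raw values from the ordered-array slide,
# not the temperature readings behind the table (those are not listed).
data = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
width = 10

# Each value falls in the class whose lower boundary is the largest multiple of the width not above it.
counts = Counter((x // width) * width for x in data)

for lower in sorted(counts):
    midpoint = lower + width // 2
    print(f"{lower} but less than {lower + width}: midpoint {midpoint}, frequency {counts[lower]}")
```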
  28. Step 2: Choose Histogram. Step 3: Input the data range and bin range (the bin range is a cell range containing the upper class boundaries for each class grouping). Select Chart Output and click "OK".
  31. Scatter diagrams are used for bivariate numerical data. Bivariate data consist of paired observations taken from two numerical variables. In the scatter diagram, one variable is measured on the vertical axis and the other variable is measured on the horizontal axis.
  32. Step 1: Select the Insert menu tab. Step 2: Select the Scatter plot dropdown and click on any of the options. If in doubt, select the first option (scatter with only markers).
  33. Cost per Day vs. Production Volume:
      Volume per day   Cost per day
      23               125
      26               140
      29               146
      33               160
      38               167
      42               170
      50               188
      55               195
      60               200
      [Figure: scatter diagram "Cost per Day vs. Production Volume"; horizontal axis: Volume per Day; vertical axis: Cost per Day]
  36. Microsoft Excel descriptive statistics output, using the house price data. House prices: $2,000,000; $500,000; $300,000; $100,000; $100,000
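The Excel output itself is not captured in this transcript, but a few of the figures a descriptive statistics summary reports can be sketched directly from the five house prices. The variable names are illustrative.

```python
from statistics import mean, median, mode

# House prices exactly as listed on the slide.
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

avg_price = mean(prices)
median_price = median(prices)
mode_price = mode(prices)
price_range = max(prices) - min(prices)

print(avg_price, median_price, mode_price, price_range)
```

The single $2,000,000 house pulls the mean well above the median, which is a useful reminder to look at more than one measure of central tendency when the data contain outliers.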
  37. Select Data Analysis. Choose Correlation from the selection menu. Click OK.
  38. Input the data range and select the appropriate options. Click OK to get the output.
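For two variables, the number a correlation analysis reports is the Pearson correlation coefficient. A minimal sketch follows, using the volume and cost pairs from the scatter diagram slide of this deck as illustrative input; the helper name `pearson_r` is hypothetical.

```python
from math import sqrt

# Paired observations from the scatter diagram slide.
volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r(volume, cost)
print(round(r, 3))
```

A value close to +1 matches what the scatter diagram suggests: cost per day rises almost linearly with production volume.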
  39. Select the input ranges from the data. Select the residuals pattern; if you are not sure, just select line fit plots.
  40. Regression Statistics
      Multiple R           0.76211
      R Square             0.58082
      Adjusted R Square    0.52842
      Standard Error       41.33032
      Observations         10

      The regression equation is: house price = 98.24833 + 0.10977 (square feet)

      ANOVA
                    df   SS            MS            F         Significance F
      Regression    1    18934.9348    18934.9348    11.0848   0.01039
      Residual      8    13665.5652    1708.1957
      Total         9    32600.5000

                    Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
      Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
      Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580
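Several figures in this output are linked by simple arithmetic, which gives a quick way to sanity-check a regression table. The sketch below recomputes them from the ANOVA and coefficient values shown above; the function name `predict_price` and the 2,000 square feet input are hypothetical, and the house price is in whatever unit the original data used, which the transcript does not state.

```python
from math import sqrt

# Values copied from the regression output above.
ss_regression, ss_residual, ss_total = 18934.9348, 13665.5652, 32600.5000
df_regression, df_residual = 1, 8
intercept, slope, slope_se = 98.24833, 0.10977, 0.03297

r_square = ss_regression / ss_total                    # share of variation explained
ms_residual = ss_residual / df_residual                # mean square of the residuals
f_stat = (ss_regression / df_regression) / ms_residual
standard_error = sqrt(ms_residual)                     # standard error of the regression
t_slope = slope / slope_se                             # t Stat for Square Feet

def predict_price(square_feet):
    """Apply the fitted regression equation (hypothetical helper)."""
    return intercept + slope * square_feet

print(round(r_square, 5), round(f_stat, 4), round(standard_error, 5), round(t_slope, 3))
print(predict_price(2000))
```

With a single predictor, the F statistic is the square of the slope's t statistic, which is why both share the same significance value of 0.01039.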