Distributions: Normal with Matt Hansen at StatStuff
- 1. Distributions: Normal
Complex Tools + Clear Teaching = Powerful Results
Section & Lesson #: Six Sigma-Measure – Lesson 9
A review of normal distributions and how to test their normality using a
normality test.
Pre-Requisite Lessons: Six Sigma-Measure #08 – Distributions: Overview
Copyright © 2011-2019 by Matthew J. Hansen. All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted by any means
(electronic, mechanical, photographic, photocopying, recording or otherwise) without prior permission in writing by the author and/or publisher.
- 2. Normal Distributions (bell curve)
o Why is it called a “normal” distribution?
• “Normal” implies the typical randomness that we expect to occur in life.
• If the data doesn’t show this randomness, then we can presume something is influencing it (bias or skewness).
• We want the data that we analyze to be unbiased, so we need to ensure it reflects the
“normal” randomness we would expect.
If the data is biased, we can’t be confident that analyzing it will lead us to the right root cause.
o A normal distribution is bell-shaped.
• The bell shape is created because most
of the data points fall in the middle.
• The shape of the bell is set by the mean (where it is
centered) and the standard deviation (how wide it is).
o Characteristics of a normal distribution.
• Completely described by its mean and standard deviation.
• The tails on either end of the curve extend to +/- infinity.
• The area under the curve represents 100% of possible observations.
• The curve is symmetrical about the mean, so 50% of the data points fall on either side of it.
• The mean (average) equals the median (50th percentile); in sample data the two will be approximately equal.
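The characteristics listed above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the metric (mean 100, standard deviation 15) and the helper name `coverage_within` are made up for illustration and are not part of the lesson.

```python
from statistics import NormalDist


def coverage_within(dist: NormalDist, k: float) -> float:
    """Share of the area under the curve within k standard deviations of the mean."""
    return dist.cdf(dist.mean + k * dist.stdev) - dist.cdf(dist.mean - k * dist.stdev)


# Hypothetical metric: mean 100, standard deviation 15 (illustrative values only).
metric = NormalDist(mu=100, sigma=15)

# Symmetry: half of the area lies on either side of the mean.
print("area below the mean:", metric.cdf(metric.mean))

# The median (50th percentile) coincides with the mean.
print("median:", metric.inv_cdf(0.5))

# Only the mean and standard deviation matter: any normal curve covers the same
# share of observations within k standard deviations as the standard normal.
for k in (1, 2, 3):
    print(k, round(coverage_within(metric, k), 4),
          round(coverage_within(NormalDist(), k), 4))
```

Changing `mu` or `sigma` shifts or stretches the curve, but the share of observations within a given number of standard deviations stays the same.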
- 3. Normality Testing
o A bell-shaped distribution isn’t necessarily “normal”.
• It’s not uncommon for a distribution to appear normal when it really isn’t.
• The normality of the distribution should be statistically tested.
o Use a Normality Test or a Probability Plot, which include the Anderson-Darling test.
• A normality test or probability plot will plot the data against a normal probability scale.
In Minitab, go to Stat > Basic Statistics > Normality Test or go to Graph > Probability Plot
• Normal data will fall close to a straight line; Minitab will fit a line along the data points.
The “fat pencil” test: if a fat pencil can lie
across the line and cover all the data points, then
it’s probably a normal distribution.
• A better test is to examine the p-value
of the Anderson-Darling (AD) test.
If p-value > 0.05, then there’s no significant evidence against normality, so we treat it as normal.
If p-value < 0.05, then it’s not normal.
– In these examples, 0.05 is referring to the
alpha risk which is the % chance of being
wrong when concluding the data is not normal.
5% is a commonly acceptable threshold.
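The straight-line check behind the probability plot and the “fat pencil” test can be sketched in Python. This is an illustration only, not Minitab's implementation: the median-rank plotting positions are an assumed convention, the function names are hypothetical, and the sample data is made up.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def probability_plot_points(data):
    """Return (sorted values, theoretical normal z-scores) for a probability plot."""
    ordered = sorted(data)
    n = len(ordered)
    # Median-rank plotting positions (i - 0.3) / (n + 0.4); other conventions
    # exist, so treat this choice as an assumption.
    z = [NormalDist().inv_cdf((i - 0.3) / (n + 0.4)) for i in range(1, n + 1)]
    return ordered, z


def straightness(data):
    """Pearson correlation between the data and its normal scores (1.0 = straight line)."""
    x, z = probability_plot_points(data)
    mx, mz = mean(x), mean(z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz / sqrt(sxx * szz)


# Made-up samples for illustration only.
rng = random.Random(1)
bell = [rng.gauss(50, 5) for _ in range(100)]
skewed = [rng.expovariate(1 / 50) for _ in range(100)]

print("bell-shaped sample:", round(straightness(bell), 4))
print("skewed sample:     ", round(straightness(skewed), 4))
```

A value close to 1 means the points hug a straight line. Like the fat pencil, this is only a visual-style check and does not replace the p-value.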
In the example plot, the p-value is > 0.05, which means the data can be treated as a normal distribution.
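The Anderson-Darling p-value rule from this slide can be sketched outside Minitab as well. The code below is a minimal standard-library sketch, not Minitab's implementation: the function name is hypothetical, the samples are made up, and the p-value uses the piecewise approximation commonly attributed to D'Agostino and Stephens. Whether Minitab uses exactly this formula is an assumption.

```python
import random
from math import exp, log
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """Return (A-squared, approximate p-value) for H0: the data is normal."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))  # mean and sd estimated from the sample
    # Clamp the CDF away from 0 and 1 so log() stays finite for extreme outliers.
    eps = 1e-12
    cdf = [min(max(fitted.cdf(v), eps), 1 - eps) for v in x]
    s = sum((2 * i + 1) * (log(cdf[i]) + log(1 - cdf[n - 1 - i])) for i in range(n))
    a2 = -n - s / n
    # Small-sample adjustment, then the piecewise p-value approximation
    # (assumed here; see the lead-in).
    a = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    if a > 13:
        p = 0.0
    elif a >= 0.6:
        p = exp(1.2937 - 5.709 * a + 0.0186 * a ** 2)
    elif a >= 0.34:
        p = exp(0.9177 - 4.279 * a - 1.38 * a ** 2)
    elif a >= 0.2:
        p = 1 - exp(-8.318 + 42.796 * a - 59.938 * a ** 2)
    else:
        p = 1 - exp(-13.436 + 101.14 * a - 223.73 * a ** 2)
    return a2, min(max(p, 0.0), 1.0)


ALPHA = 0.05  # alpha risk used throughout this lesson

# Made-up samples for illustration only.
rng = random.Random(7)
samples = {
    "bell-shaped": [rng.gauss(50, 5) for _ in range(100)],
    "skewed": [rng.expovariate(1 / 50) for _ in range(100)],
}
for name, values in samples.items():
    a2, p = anderson_darling_normal(values)
    verdict = "no evidence against normality" if p > ALPHA else "not normal"
    print(f"{name}: A2={a2:.3f}, p={p:.4f} -> {verdict}")
```

Because the p-value here is an approximation, results near the 0.05 threshold may differ slightly from what a statistics package reports.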
- 4. Practical Application
o Open the “Minitab Sample Data.MPJ” file and try to do the following:
• Run a normality test on each continuous metric.
• Which metrics are normally distributed? How can you prove it?
o Next, pull some historical data for at least 2 continuous metrics used by your
organization and try following the same steps described above.