Classifying data to convey meaning<br />Anubha – Quality Trainer<br />
Frequency distribution<br />Raw data are arranged into classes and frequencies.<br />Each class represents a grouping with a lower limit (LL) and an upper limit (UL).<br />Against each class, you count the observations that fall in it and record that number, called the frequency.<br />Range = Max - Min<br />Number of classes ≈ square root of the number of observations. Use no fewer than 5 and no more than 15 classes.<br />
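The steps on this slide can be sketched in Python. This is a minimal illustration, not a definitive procedure: the function name `frequency_distribution`, the equal-width classes, and the 25 sample observations are all assumptions made for the example.

```python
import math

def frequency_distribution(data, num_classes=None):
    """Group raw data into equal-width classes and count each class's frequency."""
    n = len(data)
    if num_classes is None:
        # Rule of thumb from the slide: about sqrt(n) classes, kept between 5 and 15.
        num_classes = min(15, max(5, round(math.sqrt(n))))
    low, high = min(data), max(data)
    width = (high - low) / num_classes  # class width = range / number of classes
    # Each row holds [lower limit, upper limit, frequency].
    table = [[low + i * width, low + (i + 1) * width, 0] for i in range(num_classes)]
    for x in data:
        # The maximum value goes into the last class so that it is not lost.
        idx = 0 if width == 0 else min(int((x - low) / width), num_classes - 1)
        table[idx][2] += 1
    return table

# Hypothetical raw data: 25 observations, so sqrt(25) = 5 classes.
data = [52, 47, 61, 55, 49, 58, 63, 50, 54, 57, 46, 60, 53,
        56, 51, 59, 48, 62, 55, 54, 52, 57, 50, 53, 56]

table = frequency_distribution(data)
for ll, ul, freq in table:
    print(f"{ll:6.1f} - {ul:6.1f} : {freq}")
```

Equal-width classes are only one possible choice; the class limits are often rounded to convenient values in practice.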
Histogram<br />Also called a frequency histogram.<br />It is a graphical representation: the x-axis shows the classes and the y-axis shows the frequency.<br />A symmetrical, bell-shaped pattern suggests a normal distribution.<br />It is used for assessing material strength, estimating process capability, indicating corrective action, and making comparisons.<br />
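A histogram can be roughed out in plain text to see the shape of a distribution. The sketch below is an illustration only: the helper `text_histogram`, the class labels, and the frequencies are hypothetical, and the bars run horizontally, so the classes appear down the side rather than along the x-axis.

```python
def text_histogram(classes, frequencies, symbol="*"):
    """Return one text bar per class; the bar length equals the class frequency."""
    lines = []
    for label, freq in zip(classes, frequencies):
        lines.append(f"{label:>8} | {symbol * freq} ({freq})")
    return "\n".join(lines)

# Hypothetical classes and frequencies, roughly bell shaped.
classes = ["46-50", "50-54", "54-58", "58-62", "62-66"]
frequencies = [3, 7, 9, 4, 2]

chart = text_histogram(classes, frequencies)
print(chart)
```

The tallest bar sits in the middle class and the bars fall away on both sides, which is the pattern the slide associates with a normal distribution.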
Central Tendency<br />Whenever we measure items from a large group, the values tend to cluster around a middle value.<br />The most widely used measures are:<br />Mean<br />Median<br />Mode<br />
Mean<br />The sum of all observations in a set divided by the total number of observations.<br />Mean = (x1 + x2 + ... + xn) / n<br />n is the total number of observations.<br />
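As a small worked example, the definition translates directly into code. The function name `mean` is an assumption for illustration; the sample values are the nine observations used in the inter-quartile example later in the deck.

```python
def mean(observations):
    """Sum of the observations divided by the number of observations."""
    n = len(observations)
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(observations) / n

sample = [12, 14, 11, 18, 10.5, 12, 14, 11, 9]
print(round(mean(sample), 2))  # 111.5 / 9, about 12.39
```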
Median<br />The median is the middle observation when the data are arranged in ascending or descending order.<br />If the sample size n is odd, the median is the ((n+1)/2)th value in the ranked data.<br />If the sample size is even, the median lies between the two middle values; take the average of those two values.<br />
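The odd and even cases can be sketched as follows. The function name `median` is an assumption for illustration, and the positions are shifted by one because Python lists are indexed from 0.

```python
def median(observations):
    """Middle value of the ranked data; average of the two middle values if n is even."""
    ranked = sorted(observations)
    n = len(ranked)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n+1)/2)th ranked value, which is index n // 2 when counting from 0.
        return ranked[mid]
    # Even n: average the two middle values.
    return (ranked[mid - 1] + ranked[mid]) / 2

print(median([12, 14, 11, 18, 10.5, 12, 14, 11, 9]))  # odd n = 9: 5th ranked value, 12
print(median([1, 2, 3, 4]))                           # even n = 4: (2 + 3) / 2 = 2.5
```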
Mode<br />The mode is the value that occurs most often.<br />It has the maximum frequency of occurrence.<br />
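A data set can have more than one value tied for the highest frequency, so the sketch below returns every such value. The function name `modes` and the sample values are assumptions for illustration.

```python
from collections import Counter

def modes(observations):
    """Return every value that shares the maximum frequency, in ascending order."""
    counts = Counter(observations)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)

print(modes([2, 3, 3, 5, 7, 3, 5]))  # [3]: the value 3 occurs three times
```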
Measures of Dispersion - Variation<br />Dispersion indicates how widely the distribution is spread around the central tendency.<br />Popular measures of dispersion:<br />Range<br />Inter-quartile range<br />Mean absolute deviation (MAD)<br />Standard deviation<br />Coefficient of variation<br />
Range<br />It is the simplest of all measures of dispersion.<br />Calculated as the difference between the max and min values in the data set.<br />Range is a popular measure of variation in quality control applications.<br />
Inter-quartile range<br />The range depends entirely on the max and min values in the data set and is misleading when one of them is an extreme value.<br />To overcome this, you can use the inter-quartile range. It is computed as the range after eliminating the highest and lowest 25% of observations in a data set arranged in ascending or descending order.<br />
Example of Inter-quartile range<br />Calculate the inter-quartile range of:<br />12, 14, 11, 18, 10.5, 12, 14, 11, 9<br />Arrange in ascending order:<br />9, 10.5, 11, 11, 12, 12, 14, 14, 18<br />Ignore the first two and last two observations (25% of 9 observations, rounded down to 2).<br />For the remaining 11, 11, 12, 12, 14, calculate the range, i.e. 14 - 11 = 3.<br />** The range for this problem is 18 - 9 = 9. The inter-quartile range of 3 is much smaller than the range of 9, showing that it is less sensitive to extreme values.<br />
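The example above can be checked with a short sketch that follows the slide's trimming approach: sort the data, drop the lowest and highest 25% of observations (rounded down), and take the range of what remains. The function names are assumptions for illustration. Statistical packages usually compute quartiles by interpolation, so they may report a slightly different figure for the same data.

```python
def data_range(observations):
    """Difference between the maximum and minimum values."""
    return max(observations) - min(observations)

def trimmed_range(observations, trim_fraction=0.25):
    """Range of the data after dropping the lowest and highest trim_fraction of values."""
    ranked = sorted(observations)
    k = int(len(ranked) * trim_fraction)  # 9 * 0.25 = 2.25, rounded down to 2
    kept = ranked[k:len(ranked) - k]
    return kept[-1] - kept[0]

sample = [12, 14, 11, 18, 10.5, 12, 14, 11, 9]
print(data_range(sample))     # 18 - 9 = 9
print(trimmed_range(sample))  # 14 - 11 = 3
```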
MAD and SD: Mean Absolute Deviation and Standard Deviation<br />MAD - The average of the deviations measured from the arithmetic mean, with all deviations treated as positive.<br />SD - The classic measure of dispersion. It is based on all observations and plays a vital role in testing hypotheses and forming confidence intervals. It is the positive square root of the variance.<br />Variance is the average of the squared deviations of each item from the mean in a data set.<br />
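These definitions can be sketched in a few lines. Two assumptions are worth flagging: the variance here divides by n, matching the "average" wording on the slide (the sample variance divides by n - 1 instead), and the coefficient of variation, which the deck lists but does not define, is taken as the standard deviation divided by the mean, its usual definition. The function name and sample data are hypothetical.

```python
import math

def dispersion_measures(observations):
    """Return MAD, variance, standard deviation and coefficient of variation."""
    n = len(observations)
    mean = sum(observations) / n
    deviations = [x - mean for x in observations]
    mad = sum(abs(d) for d in deviations) / n       # all deviations treated as positive
    variance = sum(d * d for d in deviations) / n   # population form: divide by n
    sd = math.sqrt(variance)                        # positive square root of the variance
    cv = sd / mean                                  # assumed definition; undefined if mean is 0
    return {"mad": mad, "variance": variance, "sd": sd, "cv": cv}

# Hypothetical data with mean 5.
result = dispersion_measures([2, 4, 4, 4, 5, 5, 7, 9])
print(result)  # MAD 1.5, variance 4.0, SD 2.0, CV 0.4
```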
Normal Distribution<br /><ul><li>It is unimodal and symmetrical. The mode, median and mean all coincide at the centre.</li>
<li>It has two parameters: the mean (μ) and the standard deviation (σ). The tails extend indefinitely in both directions, approaching the x-axis without ever touching it.</li>
<li>The probability density function of the normal distribution is f(x) = (1 / (σ√(2π))) · exp(-(x - μ)² / (2σ²)).</li>
<li>Symmetry is the beauty of the normal distribution.</li></ul>The area under the normal curve is commonly described by three regions around the mean.<br />The picture depicts the area covered within 1 sd (about 68%), 2 sd (about 95%) and 3 sd (about 99.7%) of the mean.<br />
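The density and the areas within 1, 2 and 3 standard deviations of the mean can be checked numerically. This is a minimal sketch using only the standard library; the function names are assumptions for illustration. It relies on the identity that the area within k standard deviations of the mean equals erf(k / √2).

```python
import math

def normal_pdf(x, mean=0.0, sd=1.0):
    """Probability density of the normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sd: {area_within(k):.4f}")
# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973
```

The symmetry property shows up directly in the density: `normal_pdf(1)` and `normal_pdf(-1)` return the same value.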
Thank you<br />For any queries, please contact<br />firstname.lastname@example.org<br />Principal Trainer – Soft skills and Quality<br />