# Module4

### Module4

1. 1. Module 4 Measures of Central Tendency and Dispersion
2. 2. Measures of Central Tendency and Dispersion <ul><li>Measures of Central Tendency </li></ul><ul><li>Mean </li></ul><ul><ul><ul><li>Arithmetic </li></ul></ul></ul><ul><ul><ul><li>Geometric </li></ul></ul></ul><ul><ul><ul><li>Harmonic </li></ul></ul></ul><ul><ul><ul><li>Weighted Mean </li></ul></ul></ul><ul><li>Median, Quartiles, Percentiles, Deciles </li></ul><ul><li>Mode </li></ul><ul><li>Measures of Variation </li></ul>
3. 3. Measures of Central Tendency and Dispersion <ul><li>Range </li></ul><ul><li>Mean Deviation </li></ul><ul><li>Standard Deviation (Variance) </li></ul><ul><li>Inter Quartile Range </li></ul><ul><li>Coefficient of Variation </li></ul><ul><li>Measures of Skewness and Kurtosis </li></ul><ul><li>Standardised Variables and Scores </li></ul>
4. 4. Measures of Location or Central Tendency <ul><li>Measure of Location </li></ul><ul><li>Centre of Gravity  </li></ul><ul><li>There are three such measures: </li></ul><ul><li>Mean </li></ul><ul><li>Median, Quartiles, Percentiles and Deciles </li></ul><ul><li>Mode </li></ul>
5. 5. Properties of a Measure <ul><li>It should be easy to understand and calculate </li></ul><ul><li>It should be based on all observations </li></ul><ul><li>It should not be much affected by a few extreme observations </li></ul><ul><li>It should be amenable to mathematical treatment. For example, we should be able to calculate the combined measure for two sets of observations given the measure for each of the two sets </li></ul>
6. 6. Mean <ul><li>There are three types of means viz., </li></ul><ul><li>Arithmetic Mean </li></ul><ul><li>Harmonic Mean </li></ul><ul><li>Geometric Mean </li></ul>
7. 7. Arithmetic Mean: Ungrouped (Raw) Data $\bar{x} = \dfrac{\text{Sum of observations}}{\text{Number of observations}} = \dfrac{\sum x_i}{n}$
8. 8. Illustration 4.1 Table 4.1: Equity Holdings of 20 Indian Billionaires (Rs. in Millions): 2717, 2796, 3098, 3144, 3527, 3534, 3862, 4186, 4310, 4506, 4745, 4784, 4923, 5034, 5071, 5424, 5561, 6505, 6707, 6874
9. 9. Illustration 4.1 For the above data, the A.M. is $\bar{x} = \dfrac{2717 + 2796 + \cdots + 5424 + \cdots + 6874}{20} = \dfrac{91308}{20} = 4565.4$, i.e. Rs. 4565.4 Millions.
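The calculation in Illustration 4.1 can be checked with a few lines of Python. This is a minimal sketch; the names `holdings` and `arithmetic_mean` are illustrative and do not come from the slides.

```python
# Equity holdings of the 20 billionaires from Table 4.1 (Rs. in millions).
holdings = [
    2717, 2796, 3098, 3144, 3527, 3534, 3862, 4186, 4310, 4506,
    4745, 4784, 4923, 5034, 5071, 5424, 5561, 6505, 6707, 6874,
]


def arithmetic_mean(values):
    """Return the sum of the observations divided by their number."""
    return sum(values) / len(values)


mean_holdings = arithmetic_mean(holdings)
print(f"A.M. = Rs. {mean_holdings:.1f} million")
```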
10. 10. Arithmetic Mean: Grouped Data $\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$
11. 11. Illustration 4.2 The calculation is illustrated with the data relating to equity holdings of the group of 20 billionaires given in Table 3.1.

        | Class Interval (1) | Frequency $f_i$ (2) | Mid Value of Class Interval $x_i$ (3) | $f_i x_i$, Col. (4) = Col. (2) × Col. (3) |
        |---|---|---|---|
        | 2000 – 3000 | 2 | 2500 | 5000 |
        | 3000 – 4000 | 5 | 3500 | 17500 |
        | 4000 – 5000 | 6 | 4500 | 27000 |
        | 5000 – 6000 | 4 | 5500 | 22000 |
        | 6000 – 7000 | 3 | 6500 | 19500 |
        | Sum | $\sum f_i = 20$ | | $\sum f_i x_i = 91000$ |
12. 12. Illustration 4.2 Substituting the values of $\sum f_i$ and $\sum f_i x_i$ in the formula: $\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i} = \dfrac{91000}{20} = 4550$
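The grouped-data calculation of Illustration 4.2 can be sketched as follows. Each class is represented by its mid value, as in the table; the tuple layout `(lower, upper, frequency)` and the function name `grouped_mean` are assumptions made for this example.

```python
# Class intervals and frequencies from Illustration 4.2: (lower, upper, frequency).
classes = [
    (2000, 3000, 2),
    (3000, 4000, 5),
    (4000, 5000, 6),
    (5000, 6000, 4),
    (6000, 7000, 3),
]


def grouped_mean(class_table):
    """Arithmetic mean of grouped data: sum(f_i * x_i) / sum(f_i),
    where x_i is the mid value of each class interval."""
    total_f = sum(freq for _, _, freq in class_table)
    total_fx = sum(freq * (lower + upper) / 2 for lower, upper, freq in class_table)
    return total_fx / total_f


mean_grouped = grouped_mean(classes)
print(mean_grouped)
```

The grouped mean (4550) differs slightly from the raw-data mean (4565.4) because every observation in a class is treated as if it sat at the class mid value.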
13. 13. Weighted Arithmetic Mean If the values $x_1, x_2, x_3, \ldots, x_i, \ldots, x_n$ have weights $w_1, w_2, w_3, \ldots, w_i, \ldots, w_n$, then the weighted mean of $x$ is given as $\bar{x}_w = \dfrac{\sum w_i x_i}{\sum w_i}$
14. 14. Illustration 4.3

        | Item | Monthly Consumption | Weight ($w_i$) | Rise in Price, Percentage ($p_i$) | $w_i p_i$ |
        |---|---|---|---|---|
        | Sugar | 5 | 5 | 20 | 100 |
        | Rice | 20 | 20 | 10 | 200 |
15. 15. Illustration 4.3 Therefore, the average price rise could be evaluated as $\bar{p} = \dfrac{\sum w_i p_i}{\sum w_i} = \dfrac{100 + 200}{5 + 20} = \dfrac{300}{25} = 12$. Thus the average price rise is 12%.
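As a quick check of Illustration 4.3, the weighted mean can be computed directly from the weights and price rises in the table. The function name `weighted_mean` is illustrative only.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Illustration 4.3: price rises (%) for sugar and rice, weighted by consumption.
price_rises = [20, 10]
weights = [5, 20]

average_rise = weighted_mean(price_rises, weights)
print(f"Average price rise = {average_rise}%")
```

An unweighted average of the two price rises would give 15%; the weighted figure is lower because rice, with the smaller rise, carries four times the weight of sugar.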
16. 16. Geometric Mean The Geometric Mean (G.M.) of a series of observations with values $x_1, x_2, x_3, \ldots, x_n$ is defined as the $n$th root of the product of these values. Mathematically, $\text{G.M.} = (x_1 \, x_2 \, x_3 \cdots x_n)^{1/n}$. It may be noted that the G.M. cannot be defined if any value of $x$ is zero, as the whole product of the values becomes zero.
17. 17. Illustration 4.5 For the data with values 2, 4 and 8, $\text{G.M.} = (2 \times 4 \times 8)^{1/3} = (64)^{1/3} = 4$
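A minimal sketch of the geometric mean follows. It averages logarithms instead of multiplying the raw values, which gives the same result and avoids overflow for long series. Rejecting negative values as well as zeros is an assumption of this sketch; the slides only rule out zeros.

```python
import math


def geometric_mean(values):
    """n-th root of the product of the values, computed via logarithms."""
    if any(v <= 0 for v in values):
        # Zero makes the product zero; negatives are excluded here by assumption.
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


gm = geometric_mean([2, 4, 8])
print(round(gm, 6))
```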
18. 18. Average Rate of Growth of Production/Business or Increase in Prices If $P_1$ is the production in the first year and $P_n$ is the production in the $n$th year, then the average rate of growth is given by $(G - 100)\%$, where $G = 100 \left( \dfrac{P_n}{P_1} \right)^{1/(n-1)}$, or $\log G = \log 100 + \dfrac{1}{n-1} \left( \log P_n - \log P_1 \right)$
19. 19. Example 4.4 The wholesale price index in the year 2000-01 was 145.3. It increased to 195.5 in the year 2005-06. What has been the average rate of increase in the index during the last 5 years? Solution: By using the formula (4.8), we have $\log G = 2 + \dfrac{1}{5} \left( \log 195.5 - \log 145.3 \right) = 2.02578$. Therefore, $G = \text{antilog}(2.02578) = 106.11$. Thus the average rate of increase $= 106.11 - 100 = 6.11\%$.
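Example 4.4 can be reproduced without log tables. The sketch below applies $G = 100 (P_n / P_1)^{1/(n-1)}$ directly; the function name and the choice of counting 2000-01 to 2005-06 as $n = 6$ yearly values (so $n - 1 = 5$ intervals) are spelled out here as assumptions consistent with the worked solution.

```python
def average_growth_rate(p_first, p_last, n_years):
    """Average yearly growth rate in percent, (G - 100),
    with G = 100 * (P_n / P_1) ** (1 / (n - 1))."""
    g = 100 * (p_last / p_first) ** (1 / (n_years - 1))
    return g - 100


# Wholesale price index: 145.3 in 2000-01 and 195.5 in 2005-06 (6 yearly values).
rate = average_growth_rate(145.3, 195.5, 6)
print(f"Average rate of increase = {rate:.2f}%")
```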
20. 20. Combined G.M. of Two Sets of Data If $G_1$ and $G_2$ are the geometric means of two sets of data with $n_1$ and $n_2$ observations respectively, then the combined geometric mean, say $G$, of the combined data is given by: $\log G = \dfrac{n_1 \log G_1 + n_2 \log G_2}{n_1 + n_2}$
21. 21. Combined G.M. of Two Sets of Data <ul><li>As another example, suppose the average growth rate during the first five years of business is 20 %, and the average growth rate of business during the next five years is 15 %, and we wish to find the average growth rate for the entire period of 10 years. This growth rate can be found by calculating the combined geometric mean of the geometric means 120 and 115, for the two blocks of 5-year periods. Thus, the requisite G.M., say G, can be worked out as follows: </li></ul>
22. 22. Combined G.M. of Two Sets of Data $\log G = \dfrac{5 \log 120 + 5 \log 115}{5 + 5} = \dfrac{5 \times 2.07918 + 5 \times 2.06070}{10} = \dfrac{20.6994}{10} = 2.06994$. Therefore, $G = \text{antilog}(2.06994) = 117.47$. Thus the combined average rate of growth for the period of 10 years is 17.47%.
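The combined geometric mean of the two five-year blocks can be verified as below. The helper name `combined_gm` is hypothetical; it implements the formula $\log G = (n_1 \log G_1 + n_2 \log G_2) / (n_1 + n_2)$ using base-10 logarithms, as in the worked solution.

```python
import math


def combined_gm(n1, g1, n2, g2):
    """Combined geometric mean of two data sets with sizes n1, n2
    and geometric means g1, g2."""
    log_g = (n1 * math.log10(g1) + n2 * math.log10(g2)) / (n1 + n2)
    return 10 ** log_g


# Growth of 20% for five years, then 15% for five years.
g_combined = combined_gm(5, 120, 5, 115)
print(f"G = {g_combined:.2f}, average growth = {g_combined - 100:.2f}%")
```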
23. 23. Weighted Geometric Mean Just like the weighted arithmetic mean, we also have a weighted geometric mean. If $x_1, x_2, \ldots, x_i, \ldots, x_n$ are $n$ observations with weights $w_1, w_2, \ldots, w_i, \ldots, w_n$, then their weighted G.M. is defined through its logarithm as: $\log \text{G.M.} = \dfrac{\sum w_i \log x_i}{\sum w_i}$
24. 24. Harmonic Mean The harmonic mean (H.M.) is defined as the reciprocal of the arithmetic mean of the reciprocals of the observations. For example, if $x_1$ and $x_2$ are two observations, then the arithmetic mean of their reciprocals, viz. $1/x_1$ and $1/x_2$, is $\dfrac{(1/x_1) + (1/x_2)}{2} = \dfrac{x_1 + x_2}{2 x_1 x_2}$. The reciprocal of this arithmetic mean is called the harmonic mean. Thus the harmonic mean of two observations $x_1$ and $x_2$ is $\text{H.M.} = \dfrac{2 x_1 x_2}{x_1 + x_2}$
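The definition translates directly into code. In this sketch the two observations 40 and 60 are made-up values chosen only for illustration; they do not appear in the slides.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean requires strictly positive values")
    return len(values) / sum(1 / v for v in values)


# Hypothetical observations, used only to compare the two expressions.
x1, x2 = 40, 60
hm_general = harmonic_mean([x1, x2])
hm_two_values = 2 * x1 * x2 / (x1 + x2)  # closed form for two observations
print(hm_general, hm_two_values)
```

Both expressions agree up to floating-point rounding, which is the point of the two-observation derivation above.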
25. 25. Relationship Among A.M., G.M. and H.M. The relationships among the magnitudes of the three types of means calculated from the same data are as follows: (i) $\text{H.M.} \le \text{G.M.} \le \text{A.M.}$, i.e. the arithmetic mean is greater than or equal to the geometric mean, which is greater than or equal to the harmonic mean. (ii) $\text{G.M.} = \sqrt{\text{A.M.} \times \text{H.M.}}$, i.e. the geometric mean is the square root of the product of the arithmetic mean and the harmonic mean (this holds exactly for two observations). (iii) $\text{H.M.} = (\text{G.M.})^2 / \text{A.M.}$
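These relationships can be checked numerically with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The data set 2, 4, 8 is taken from Illustration 4.5; the pair 40, 60 is a made-up example for the two-observation identity.

```python
import statistics

# (i) H.M. <= G.M. <= A.M. for the data of Illustration 4.5.
data = [2, 4, 8]
am = statistics.mean(data)
gm = statistics.geometric_mean(data)
hm = statistics.harmonic_mean(data)
print(f"H.M. = {hm:.4f}, G.M. = {gm:.4f}, A.M. = {am:.4f}")

# (ii) and (iii): for two observations, G.M.^2 = A.M. x H.M.
pair = [40, 60]  # hypothetical values
am2 = statistics.mean(pair)
gm2 = statistics.geometric_mean(pair)
hm2 = statistics.harmonic_mean(pair)
print(gm2 ** 2, am2 * hm2)
```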
26. 26. Median <ul><li>Whenever there are some extreme values in the data, calculation of the A.M. is not desirable. </li></ul><ul><li>Further, whenever exact values of some observations are not available, the A.M. cannot be calculated. </li></ul><ul><li>In both situations, another measure of location, called the Median, is used. </li></ul>
27. 27. Median - Ungrouped Data First, the data is arranged in ascending/descending order. In the earlier example relating to the equity holdings of 20 billionaires given in Table 4.1, the data arranged in ascending order is as follows: 2717, 2796, 3098, 3144, 3527, 3534, 3862, 4186, 4310, 4506, 4745, 4784, 4923, 5034, 5071, 5424, 5561, 6505, 6707, 6874. Here, the number of observations is 20, so there is no single middle observation. The two middle-most observations are the 10th and 11th, with values 4506 and 4745. Therefore, the median is their average: $\text{Median} = \dfrac{4506 + 4745}{2} = \dfrac{9251}{2} = 4625.5$. Thus, the median equity holding of the 20 billionaires is Rs. 4625.5 Millions.
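The procedure for ungrouped data (sort, then take the middle value or the average of the two middle values) can be sketched as follows. The function name `median` is illustrative; the standard library's `statistics.median` follows the same rule.

```python
def median(values):
    """Middle value of the sorted data; for an even count,
    the average of the two middle-most values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Equity holdings from Table 4.1 (Rs. in millions).
holdings = [
    2717, 2796, 3098, 3144, 3527, 3534, 3862, 4186, 4310, 4506,
    4745, 4784, 4923, 5034, 5071, 5424, 5561, 6505, 6707, 6874,
]

median_holdings = median(holdings)
print(f"Median = Rs. {median_holdings} million")
```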
28. 28. Median - Grouped Data <ul><li>The median for grouped data is defined as the value corresponding to the $(n/2)$th observation, and is calculated from the following formula: </li></ul> $\text{Median} = L_m + \dfrac{(n/2) - f_c}{f_m} \times w_m$ <ul><li>where, </li></ul><ul><li>$L_m$ is the lower limit of the median class interval, i.e. the interval which contains the $(n/2)$th observation </li></ul><ul><li>$f_m$ is the frequency of the median class interval </li></ul><ul><li>$f_c$ is the cumulative frequency of the class intervals preceding the median class interval </li></ul><ul><li>$w_m$ is the width of the median class interval </li></ul><ul><li>$n$ is the total number of observations. </li></ul>
29. 29. Illustration 4.2

        | Class Interval | Frequency | Cumulative Frequency |
        |---|---|---|
        | 2000-3000 | 2 | 2 |
        | 3000-4000 | 5 | 7 |
        | 4000-5000 | 6 | 13 |
        | 5000-6000 | 4 | 17 |
        | 6000-7000 | 3 | 20 |
30. 30. Illustration 4.2 Here, $n = 20$, and the median class interval is 4000 to 5000, as the 10th observation lies in this interval. Further, $L_m = 4000$, $f_m = 6$, $f_c = 7$ and $w_m = 1000$. Therefore, $\text{Median} = 4000 + \dfrac{(20/2) - 7}{6} \times 1000 = 4000 + \dfrac{3}{6} \times 1000 = 4000 + 500 = 4500$
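The grouped-median calculation can be sketched in code: walk through the classes accumulating frequencies until the class containing the $(n/2)$th observation is found, then interpolate within it. The tuple layout `(lower, upper, frequency)` and the name `grouped_median` are assumptions of this sketch.

```python
# Class intervals and frequencies of the equity holdings data.
classes = [
    (2000, 3000, 2),
    (3000, 4000, 5),
    (4000, 5000, 6),
    (5000, 6000, 4),
    (6000, 7000, 3),
]


def grouped_median(class_table):
    """Median = L_m + ((n/2 - f_c) / f_m) * w_m for grouped data."""
    n = sum(freq for _, _, freq in class_table)
    target = n / 2
    cumulative = 0  # f_c: cumulative frequency before the current class
    for lower, upper, freq in class_table:
        if cumulative + freq >= target:
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("empty frequency table")


median_grouped = grouped_median(classes)
print(median_grouped)
```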
31. 31. Median <ul><li>The median divides the data into two parts such that the number of observations less than the median is equal to the number of observations more than it. </li></ul><ul><li>This property makes the median a very useful measure when the data is skewed, as with income distribution among persons/households, marks obtained in competitive examinations such as those for admission to Engineering/Medical Colleges, etc. </li></ul>
32. 32. Graphical Method of Finding the Median <ul><li>If we draw both the ogives, viz. "Less Than" and "More Than", for a set of data, then the value on the horizontal axis at the point of intersection of the two ogives is the Median. </li></ul>
33. 33. Quartiles <ul><li>The median divides the data into two parts such that 50% of the observations are less than it and 50% are more than it. Similarly, there are "Quartiles". There are three quartiles, viz. $Q_1$, $Q_2$ and $Q_3$. These are referred to as the first, second and third quartiles. </li></ul><ul><li>The first quartile, $Q_1$, divides the data into two parts such that 25% (a quarter) of the observations are less than it and 75% are more than it. </li></ul><ul><li>The second quartile, $Q_2$, is the same as the median. The third quartile, $Q_3$, divides the data into two parts such that 75% of the observations are less than it and 25% are more than it. </li></ul><ul><li>All these can be determined graphically with the help of the ogive curve. </li></ul>
34. 34. Quartiles
35. 35. Quartiles $Q_1$ and $Q_3$ are defined as the values corresponding to the observations given below:

        | Measure | Ungrouped Data (arranged in ascending or descending order) | Grouped Data |
        |---|---|---|
        | Lower Quartile $Q_1$ | $\{(n+1)/4\}$th | $(n/4)$th |
        | Median $Q_2$ | $\{(n+1)/2\}$th | $(n/2)$th |
        | Upper Quartile $Q_3$ | $\{3(n+1)/4\}$th | $(3n/4)$th |
36. 36. Quartiles $Q_1 = L_{Q_1} + \dfrac{(n/4) - f_c}{f_{Q_1}} \times w_{Q_1}$ and $Q_3 = L_{Q_3} + \dfrac{(3n/4) - f_c}{f_{Q_3}} \times w_{Q_3}$
37. 37. Equity Holding Data

        | Class Interval | Frequency | Cumulative Frequency |
        |---|---|---|
        | 2000-3000 | 2 | 2 |
        | 3000-4000 | 5 | 7 |
        | 4000-5000 | 6 | 13 |
        | 5000-6000 | 4 | 17 |
        | 6000-7000 | 3 | 20 |
38. 38. $Q_1 = 3000 + \dfrac{(20/4) - 2}{5} \times 1000 = 3000 + \dfrac{5 - 2}{5} \times 1000 = 3000 + \dfrac{3000}{5} = 3000 + 600 = 3600$. The interpretation of this value of $Q_1$ is that 25% of the billionaires have equity holdings less than Rs. 3600 Millions.
39. 39. $Q_3 = 5000 + \dfrac{15 - 13}{4} \times 1000 = 5000 + \dfrac{2}{4} \times 1000 = 5000 + 500 = 5500$. The interpretation of this value of $Q_3$ is that 75% of the billionaires have equity holdings less than Rs. 5500 Millions.
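The quartile formulas share one shape: lower limit plus an interpolated share of the class width. A small helper (the name `value_at_position` and its parameter names are hypothetical) makes this explicit, using the class limits, frequencies and cumulative frequencies of the equity holdings table.

```python
def value_at_position(lower, freq, cum_before, width, position):
    """Interpolated value for grouped data:
    lower + ((position - cum_before) / freq) * width."""
    return lower + (position - cum_before) / freq * width


n = 20  # total number of observations

# Q1: the (n/4)th = 5th observation lies in 3000-4000 (f = 5, f_c = 2).
q1 = value_at_position(lower=3000, freq=5, cum_before=2, width=1000, position=n / 4)

# Q3: the (3n/4)th = 15th observation lies in 5000-6000 (f = 4, f_c = 13).
q3 = value_at_position(lower=5000, freq=4, cum_before=13, width=1000, position=3 * n / 4)

print(q1, q3)
```

The same helper covers the median (`position=n / 2`) and any percentile (`position=p / 100 * n`), provided the class containing that position is identified first.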
40. 40. Percentiles $P_{95} = L_{P_{95}} + \dfrac{(95/100)\, n - f_c}{f_{P_{95}}} \times w_{P_{95}}$ where $L_{P_{95}}$ is the lower limit of the class interval containing the 95th percent of the total frequency, $f_c$ is the cumulative frequency up to the 95th percentile interval, $f_{P_{95}}$ is the frequency of the 95th percentile interval and $w_{P_{95}}$ is the width of the 95th percentile interval.
41. 41. Deciles <ul><li>Just as quartiles divide the data into four parts, the deciles divide the data into ten parts: first decile (10%), second decile (20%), and so on. In fact, $P_{10}, P_{20}, \ldots, P_{90}$ are the same as the deciles. And just as the second quartile and the median are the same, so the fifth decile, i.e. $P_{50}$, and the median are the same. </li></ul>
42. 42. Mode $\text{Mode} = L_m + \dfrac{f_m - f_0}{2 f_m - f_0 - f_2} \times w_m$ <ul><li>where, </li></ul><ul><li>$L_m$ is the lower limit of the modal class interval </li></ul><ul><li>$f_m$ is the frequency of the modal class interval </li></ul><ul><li>$f_0$ is the frequency of the interval just before the modal interval </li></ul><ul><li>$f_2$ is the frequency of the interval just after the modal interval </li></ul><ul><li>$w_m$ is the width of the modal class interval </li></ul>
43. 43. Equity Holding Data <ul><li>The modal interval, i.e. the class interval with the maximum frequency (6), is 4000 to 5000. Further, </li></ul><ul><li>$L_m = 4000$ </li></ul><ul><li>$w_m = 1000$ </li></ul><ul><li>$f_m = 6$ </li></ul><ul><li>$f_0 = 5$ </li></ul><ul><li>$f_2 = 4$ </li></ul><ul><li>Therefore, </li></ul>
44. 44. Equity Holding Data $\text{Mode} = 4000 + \dfrac{6 - 5}{2 \times 6 - 5 - 4} \times 1000 = 4000 + \dfrac{1}{3} \times 1000 = 4000 + 333.3 = 4333.3$ <ul><li>Thus the modal equity holding of the billionaires is Rs. 4333.3 Millions. </li></ul>
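The mode calculation for the equity holdings data can be reproduced with a short function. The name `grouped_mode` and its parameter names are illustrative; the formula is the one used in the worked computation, with denominator $2 f_m - f_0 - f_2$.

```python
def grouped_mode(lower, f_modal, f_before, f_after, width):
    """Mode = L_m + ((f_m - f_0) / (2*f_m - f_0 - f_2)) * w_m."""
    denominator = 2 * f_modal - f_before - f_after
    if denominator == 0:
        raise ValueError("modal class is not distinct from its neighbours")
    return lower + (f_modal - f_before) / denominator * width


# Modal class 4000-5000 with f_m = 6, f_0 = 5, f_2 = 4, w_m = 1000.
mode_holdings = grouped_mode(lower=4000, f_modal=6, f_before=5, f_after=4, width=1000)
print(f"Mode = Rs. {mode_holdings:.1f} million")
```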
45. 45. Empirical Relationship among Mean, Median and Mode <ul><li>In a moderately skewed distribution, it is found that the following relationship generally holds good: </li></ul><ul><li>Mean – Mode = 3 (Mean – Median) </li></ul><ul><li>From the above relationship among Mean, Median and Mode, if the values of two of these are given, the value of the third measure can be found. </li></ul>
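Rearranging the empirical relationship gives Mode = 3 × Median − 2 × Mean. The sketch below applies it to the mean (4565.4) and grouped median (4500) found earlier for the equity holdings data; the function name is hypothetical, and the result is only an approximation, since the relationship is empirical.

```python
def empirical_mode(mean, median):
    """Estimate the mode from Mean - Mode = 3 * (Mean - Median)."""
    return 3 * median - 2 * mean


estimated_mode = empirical_mode(mean=4565.4, median=4500)
print(f"Estimated mode = {estimated_mode:.1f}")
```

The estimate (about 4369) is close to, but not equal to, the value of 4333.3 obtained from the grouped-data mode formula, which is what "generally holds good" suggests.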
46. 46. Equity Holding Data: Mode = 4333, Median = 4500, Mean = 4565
47. 47. Right Skewed Distribution: Mode < Median < Mean
48. 48. Symmetrical Distribution: Mode = Median = Mean
49. 49. Left Skewed Distribution: Mean < Median < Mode
50. 50. Features of a Good Statistical Average <ul><li>It should be readily computable and easily understood </li></ul><ul><li>It should be based on all the observations </li></ul><ul><li>It should be reliable enough to be taken as a true representative of the population </li></ul><ul><li>It should not be much affected by the extreme values in the data </li></ul><ul><li>It should be amenable to further mathematical treatment. This property helps in assessing the reliability of conclusions drawn about the population value with the help of the sample value </li></ul><ul><li>It should not vary much from sample to sample taken from the same population. </li></ul>
51. 51. Comparison of Measures of Location: Arithmetic Mean. Advantages: <ul><li>(i) Easy to understand and calculate </li></ul><ul><li>(ii) Makes use of full data </li></ul><ul><li>(iii) Only the number and sum of the observations need be known for its calculation </li></ul> Disadvantages: <ul><li>(i) Unduly influenced by extreme values </li></ul><ul><li>(ii) Cannot be calculated from grouped data with open-end class intervals, or from ungrouped data when the values of all observations are not available and all that is known is that some observations are either less than or greater than some value </li></ul>
52. 52. Geometric Mean. Advantages: <ul><li>(i) Makes use of full data </li></ul><ul><li>(ii) Extremely large values have a lesser impact </li></ul><ul><li>(iii) Useful for data relating to ratios and percentages </li></ul><ul><li>(iv) Useful for rates of change/growth </li></ul> Disadvantages: <ul><li>(i) Cannot be calculated if any observation has the value zero </li></ul><ul><li>(ii) Difficult to calculate and interpret </li></ul>
53. 53. Median. Advantages: <ul><li>(i) Simple to understand </li></ul><ul><li>(ii) Extreme values do not have any impact </li></ul><ul><li>(iii) Can be calculated even if the values of all observations are not known or the data has open-end class intervals </li></ul><ul><li>(iv) Used for measuring qualities and factors which are not quantifiable </li></ul><ul><li>(v) Can be approximately determined with the help of a graph (ogives) </li></ul> Disadvantages: <ul><li>(i) Arranging values in ascending/descending order may sometimes be tedious </li></ul><ul><li>(ii) The sum of the observations cannot be found if only the median is known </li></ul><ul><li>(iii) Not amenable to mathematical calculations </li></ul>