Statistik topic2 tabular presentation


Published in: Education, Technology

  1. Topic 2 Tabular Presentation

     LEARNING OUTCOMES
     By the end of this topic, you should be able to:
     1. develop a frequency distribution table;
     2. formulate a relative frequency distribution table;
     3. prepare a cumulative frequency distribution table.

     INTRODUCTION
     You were introduced to various types of data in Topic 1. In this topic, we will learn how to present data in tabular form so that we can study the properties of a data distribution further. The three forms covered are the Frequency Distribution Table, the Relative Frequency Distribution and the Cumulative Frequency Distribution. Tabular presentation is suitable for all types of data. A table is much easier to understand than raw data, and for a qualitative variable it allows a quick comparison between categorical values. Another advantage is that the information lost when the data are tabulated can be kept small.

     What is the difference between a Frequency Distribution Table, a Relative Frequency Distribution and a Cumulative Frequency Distribution? Think about it.
  2. 2.1 FREQUENCY DISTRIBUTION TABLE
     Table 2.1 below is an example of a Frequency Distribution Table for a qualitative variable (ethnicity). The first row shows the categorical values of the variable and the second row shows the frequency of each categorical value. The second row tells us how a total of 550 students are distributed across the respective categorical values. We can see that 245 students are Malay, 182 students are Chinese and so on.

     Table 2.1: Frequency Distribution of Students by Ethnicity in School J

     Ethnic Background (x)   Malay   Chinese   Indian   Others   Total
     Frequency (f)             245       182       84       39     550

     Quantitative data involving large numbers may be divided into several non-overlapping classes or intervals. The frequency of each class is obtained by counting the data falling in that class.

     Table 2.2 shows the Frequency Distribution of the monthly family income of students at School J. The first row shows the income classes, and the second row shows the frequency, that is, the number of students whose monthly family income falls in each class. The second row again tells us how the 550 students are distributed across the classes. There are 98 of the 550 students whose families have a monthly income between RM0 and RM1,000, 152 whose families have an income in the interval RM1,001 - RM2,000, and so on.

     Table 2.2: Frequency Distribution of Family Income of Students at School J

     Monthly Income (RM x)   0 - 1000   1001 - 2000   2001 - 3000   3001 - 4000   4001 - 5000   Total
     Frequency (f)                 98           152           100           180            20     550

     (a) Developing a Frequency Distribution Table of Quantitative Data
     Let us again examine Table 2.2. Each class consists of a lower limit and an upper limit separated by a hyphen '-'. For example, the second class has a lower limit of RM1001 and an upper limit of RM2000, whereas the fifth class has a lower limit of RM4001 and an upper limit of RM5000. By comparing the upper limit of a class with the lower limit of the class that follows it, it is clear that no two adjacent classes overlap.
  3. This property is very important in developing a frequency table, because it avoids double counting of any data when obtaining the frequency of each class. Another property is that any two adjacent classes are separated by a middle point called the class boundary. Thus, each class also has a lower and an upper boundary. Let us now develop a Frequency Distribution Table for the books sold weekly by a book store, given in Table 2.3 below. As you can see, the data are discrete. Can you explain why?

     Table 2.3: The Number of Books Sold Weekly for 50 Weeks by a Book Store

     35  75  65  62  68  55  66  60  62  80
     65  70  66  60  72  95  85  66  70  68
     65  62  78  80  47  70  68  90  40  72
     70  50  70  72  55  55  60  56  48  75
     74  62  45  52  55  68  82  80  75  75

     (i) The Number of Classes
     The following are some guides for determining the number of classes:
     • The total number of classes in a distribution table should be neither too small nor too large, otherwise it will distort the original shape of the data distribution. Usually one can choose any number between 5 and 15 classes.
     • Depending on the size of the data, the distribution sometimes becomes too flat if one chooses more than 15 classes, or too peaked if one chooses fewer than 5 classes.
     • However, the following empirical formula can be used to determine the approximate number of classes (K) for a given number of observations n:

       $K \approx 1 + 3.3 \log_{10}(n)$   (2.1)

     For the weekly book sales, we have
  4. $K \approx 1 + 3.3 \log_{10}(50) \approx 6.6$

     As this is an approximation, one can choose any integer close to the above value. In this example we choose 6 as the approximate number of classes.

     (ii) Class Width and Class Limits
     Class width can differ from one class to another. Usually, the same class width for all classes is recommended when developing a frequency distribution table. The following empirical formula can be used to determine the approximate class width:

       $\text{Class width} \approx \dfrac{\text{Data range}}{\text{Number of classes } (K)}$   (2.2)

     (iii) Data Range
     The data range, sometimes simply called the range, is the difference between the largest and smallest observed values. For the weekly book sales, the class width is:

       $\text{Class width} \approx \dfrac{\text{largest number} - \text{smallest number}}{K} = \dfrac{95 - 35}{6} = 10 \text{ books}$

     Here again we are making an approximation. Since the data are discrete, it is wise to choose a round figure fairly close to the approximate value (if necessary). For the above data, we choose 10 books as the class width, which is sometimes called the class interval.

     (iv) Limits of Each Class
     The simple rules below apply when one seeks the class limits for each class interval:
  5. • Identify the smallest as well as the largest data value.
     • All data must be enclosed between the lower limit of the first class and the upper limit of the final class.
     • The smallest data value should be within the first class. Thus the lower limit of the first class can be any number less than or equal to the smallest data value.
     • When all classes have the same class width, the lower limit of a class is equal to the lower limit of the previous class plus the class width. We proceed this way to build up all the classes until all data are covered.
     • A tallying process is normally used to count the data that fall in each class; this count becomes the frequency of the class.

     For the weekly book sales data, let 34 be the lower limit of the first class. The lower limit of the second class is then 44 (i.e. 34 + 10: the lower limit of the first class is incremented by the class width), the lower limit of the third class is 54, and so on until we get the lower limit of the final class as 94 (i.e. 84 + 10). The upper limit of the first class is 43 (just 1 unit less than the lower limit of the second class). We can build the upper limits of all classes in the same manner. Eventually, we have the classes 34-43, 44-53, 54-63, 64-73, 74-83, 84-93 and 94-103.

     Notice that 7 classes were actually developed, one more than the 6 chosen from the calculated value of K. This is no cause for concern: it often happens because K and the class width are both rounded approximations.

     (v) Frequency of Each Class
     The following process is recommended for determining the frequency of each class:
     • The tally counting method is the easiest way to determine the frequency of each class from the given set of data.
     • Begin with the first number in the data set, find the class into which the number falls, then strike one vertical bar (stroke) for that class. If the second number falls into the same class, strike a second stroke for that class, and so on.
  6. • Once there are four strokes for a class, the fifth stroke is drawn as a back-stroke across the first four to tie them into one 'bundle'. So one 'bundle' comprises 5 strokes altogether.
     • The process of finding the class for each data value continues until all data are covered.
     • As one stroke represents one data value, a bundle represents 5 data values falling into the class. Counting by bundles makes the counting process much easier. There may be several 'bundles' and/or loose strokes for a class.
     • The total number of strokes is the frequency for that class.
     • The total frequency for all classes is then equal to the total number of data in the sample.
     • The counting process for the weekly book sales is given in Table 2.4 below.

     Table 2.4: Frequency Distribution of Books on Weekly Sales
     (Each group 'llll' stands for one crossed bundle of 5 strokes; loose strokes are shown as 'l'.)

     Class       Counting Tally            Frequency (f)
     34 - 43     ll                              2
     44 - 53     llll                            5
     54 - 63     llll llll ll                   12
     64 - 73     llll llll llll lll             18
     74 - 83     llll llll                      10
     84 - 93     ll                              2
     94 - 103    l                               1
     Sum                                    f = 50

     (vi) Class Boundaries and Class Mid-points
     Any two adjacent classes are separated by a middle point called the class boundary. It is the mid-point between the lower limit of a class and the upper limit of the previous class. This separation ensures that no two adjacent classes overlap. Thus, each class has a lower boundary and an upper boundary.
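Steps (i) to (v) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original module: the function names `suggested_classes`, `build_classes` and `tally` are hypothetical, and the choices of 34 as the first lower limit and 10 as the class width are taken from the worked example above.

```python
import math

# Weekly book sales for 50 weeks (Table 2.3).
sales = [
    35, 75, 65, 62, 68, 55, 66, 60, 62, 80,
    65, 70, 66, 60, 72, 95, 85, 66, 70, 68,
    65, 62, 78, 80, 47, 70, 68, 90, 40, 72,
    70, 50, 70, 72, 55, 55, 60, 56, 48, 75,
    74, 62, 45, 52, 55, 68, 82, 80, 75, 75,
]


def suggested_classes(n):
    """Empirical guide (2.1): K is roughly 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)


def build_classes(data, first_lower, width):
    """Return (lower limit, upper limit) pairs that cover all the data."""
    classes = []
    lower = first_lower
    while lower <= max(data):
        # The upper limit is 1 unit below the next class's lower limit.
        classes.append((lower, lower + width - 1))
        lower += width
    return classes


def tally(data, classes):
    """Count how many observations fall within the limits of each class."""
    return [sum(lo <= x <= hi for x in data) for lo, hi in classes]


k = suggested_classes(len(sales))        # about 6.6, so 6 is chosen
width = (max(sales) - min(sales)) / 6    # (95 - 35) / 6
classes = build_classes(sales, first_lower=34, width=10)
frequencies = tally(sales, classes)

for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo:>3} - {hi:<3} {f:>3}")
print("Sum f =", sum(frequencies))
```

The printed frequencies should agree with Table 2.4. Because the limits are inclusive whole numbers, the sketch applies to discrete data only; continuous data would need boundaries rather than limits.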
  7. The lower boundary of a given class is also the upper boundary of the previous class, as demonstrated in Figure 2.1. Class boundaries can be obtained as follows:

       $\text{Lower boundary of a class} = \dfrac{\text{upper limit of previous class} + \text{lower limit of that class}}{2}$

       $\text{Upper boundary of a class} = \dfrac{\text{upper limit of that class} + \text{lower limit of next class}}{2}$

     The class mid-point is located at the middle of each class and is obtained by:

       $\text{Class mid-point} = \dfrac{\text{lower boundary of that class} + \text{upper boundary of that class}}{2}$

     The class mid-point is a very important number because it represents all data that fall in that particular class, irrespective of their actual raw values. By virtue of this role, for the weekly book sales data we are in effect reducing the data set to K class mid-points. These K class mid-points are then used in further calculations of descriptive statistics, such as the mean, mode and median of the data distribution.

     [Figure 2.1: The property of any class]

     Table 2.5 below shows the properties of the classes of the frequency table.
  8. Table 2.5: The Lower Class-boundary, Class Mid-point and Upper Class-boundary of the Frequency Table of Books

     Class       Lower Boundary   Class Mid-point (x)   Upper Boundary   Frequency (f)
     34 - 43          33.5               38.5                43.5               2
     44 - 53          43.5               48.5                53.5               5
     54 - 63          53.5               58.5                63.5              12
     64 - 73          63.5               68.5                73.5              18
     74 - 83          73.5               78.5                83.5              10
     84 - 93          83.5               88.5                93.5               2
     94 - 103         93.5               98.5               103.5               1
                                                                          f = 50

     (b) The Actual Frequency Table
     The actual frequency table is the one without the tally counting column, as follows:

     Table 2.6: Frequency Distribution Table on Weekly Book Sales

     Class           34 - 43   44 - 53   54 - 63   64 - 73   74 - 83   84 - 93   94 - 103
     Frequency (f)         2         5        12        18        10         2          1

     A data set comprising non-repeating individual numbers or observations can be grouped into several classes before a frequency table is developed. Do you agree with this idea? Give your opinion.

     You should attempt the following exercises to test your understanding of the concepts discussed.
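Before turning to the exercises, the boundary and mid-point formulas can be checked against Table 2.5 with a short Python sketch. It is illustrative only: the function name `class_properties` is hypothetical, and the sketch assumes that every pair of adjacent classes is separated by the same gap (1 book here), so that the first and last classes, which have no neighbour on one side, can reuse that gap.

```python
# Class limits from Table 2.6.
classes = [(34, 43), (44, 53), (54, 63), (64, 73), (74, 83), (84, 93), (94, 103)]


def class_properties(classes):
    """Return (lower boundary, mid-point, upper boundary) for each class.

    Assumption: the gap between an upper limit and the next lower limit is
    the same for all adjacent classes, so each boundary lies half a gap
    outside the corresponding class limit.
    """
    gap = classes[1][0] - classes[0][1]  # 44 - 43 = 1 unit
    rows = []
    for lower_limit, upper_limit in classes:
        lower_boundary = lower_limit - gap / 2
        upper_boundary = upper_limit + gap / 2
        mid_point = (lower_boundary + upper_boundary) / 2
        rows.append((lower_boundary, mid_point, upper_boundary))
    return rows


rows = class_properties(classes)
for (lo, hi), (lb, mid, ub) in zip(classes, rows):
    print(f"{lo:>3} - {hi:<3}  {lb:>6}  {mid:>6}  {ub:>6}")
```

Each upper boundary coincides with the lower boundary of the next class, which is the property that prevents double counting.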
  9. ACTIVITY 2.1
     1. The following are the marks for the Statistics subject obtained by 40 students in a final examination. Develop a frequency table, using 4 as the lower limit of the first class.

        60  20  10  25   5  35  30  65  15  40
        45   5  30  55  60  45  50   8  10  40
        20  30  34   4  25  56  48   9  16  44
        70  24   7   9  36  30  30  40  65  50

        (a) State the lower and upper limits and the frequency of the second class.
        (b) Obtain the lower and upper boundaries, and the class mid-point, of the fifth class.

     2.2 RELATIVE FREQUENCY DISTRIBUTION
     The relative frequency of a class is the ratio of its frequency to the total frequency. Each relative frequency has a value between 0 and 1, and the total of all relative frequencies is equal to 1.

     Relative frequency can also be expressed as a percentage by multiplying each relative frequency by 100%. The total is then 100%. By referring to Table 2.6, the Relative Frequency Distribution for the weekly book sales can be developed. This is given in Table 2.7 below.

     From Table 2.7, one can easily tell the proportion or percentage of all data that fall in a particular class. For example, about 0.04 or 4% of the data are between 34 and 43 books sold per week. By doing some additions, we can also tell that about 0.80 or 80% (i.e. 24% + 36% + 20%) of the data are between 54 and 83 books, and only 6% are above 83 books sold per week.
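The relative frequencies described above can be reproduced with a short Python sketch. It is illustrative only; the variable names are hypothetical, and the frequencies are those of Table 2.6.

```python
# Class limits and frequencies from Table 2.6.
classes = [(34, 43), (44, 53), (54, 63), (64, 73), (74, 83), (84, 93), (94, 103)]
frequencies = [2, 5, 12, 18, 10, 2, 1]

total = sum(frequencies)                           # 50 weeks
relative = [f / total for f in frequencies]        # proportions between 0 and 1
percent = [100 * f / total for f in frequencies]   # the same ratios in %

for (lo, hi), f, r, p in zip(classes, frequencies, relative, percent):
    print(f"{lo:>3} - {hi:<3}  f={f:>2}  rel={r:.2f}  {p:.0f}%")

# Share of weeks with between 54 and 83 books sold (third to fifth classes).
share_54_to_83 = sum(percent[2:5])
print(f"54 to 83 books: {share_54_to_83:.0f}%")
```

Summing the percentages of the third to fifth classes gives the 80% quoted in the text, and the last two classes together give the 6% above 83 books.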
  10. Table 2.7: Relative Frequency Distribution for the Books on Weekly Sales

      Class                    34 - 43   44 - 53   54 - 63   64 - 73   74 - 83   84 - 93   94 - 103    Sum
      Frequency (f)                  2         5        12        18        10         2          1     50
      Relative Frequency          0.04      0.10      0.24      0.36      0.20      0.04       0.02   1.00
      Relative Frequency (%)         4        10        24        36        20         4          2    100

      2.3 CUMULATIVE FREQUENCY DISTRIBUTION
      The total frequency of all values less than the upper class boundary of a given class is called the cumulative frequency up to and including the upper limit of that class. For example, the cumulative frequency up to and including the class 54-63 in Table 2.7 is 2 + 5 + 12 = 19, signifying that in 19 of the weeks fewer than 63.5 books (that is, at most 63 books) were sold. A table presenting such cumulative frequencies is called a cumulative frequency distribution table, a cumulative frequency table, or briefly a cumulative distribution. There are two types of cumulative distribution:
      (a) the "Less-than or Equal" cumulative distribution, using upper boundaries as the partition;
      (b) the "More-than" cumulative distribution, using lower boundaries as the partition.
      In this course we will concentrate only on the first type.

      Table 2.8 presents the "Less-than or Equal" cumulative distribution for the weekly book sales. For this type, we need to add a class with zero frequency before the first class of Table 2.6, and use its upper boundary, 33.5 books.
  11. Table 2.8: Developing Cumulative Distribution Type "Less-than or Equal" for the Books on Weekly Sales

      Class       Frequency (f)   Upper Boundary   Cumulating Process   Cumulative Frequency
      24 - 33           0              33.5              0                       0
      34 - 43           2              43.5              0 + 2                   2
      44 - 53           5              53.5              2 + 5                   7
      54 - 63          12              63.5              7 + 12                 19
      64 - 73          18              73.5              19 + 18                37
      74 - 83          10              83.5              37 + 10                47
      84 - 93           2              93.5              47 + 2                 49
      94 - 103          1             103.5              49 + 1                 50
      Sum          f = 50

      The actual cumulative distribution table is given in Table 2.9 below. The column for cumulative frequency in percentage (%) is optional.

      Table 2.9: The "Less-than or Equal" Cumulative Distribution for the Books on Weekly Sales

      Upper Boundary   Cumulative Frequency   Cumulative Frequency (%)
           33.5                  0                        0
           43.5                  2                        4
           53.5                  7                       14
           63.5                 19                       38
           73.5                 37                       74
           83.5                 47                       94
           93.5                 49                       98
          103.5                 50                      100

      Do attempt the following exercises to test your understanding.
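Before the exercises, the cumulating process of Table 2.8 can be sketched in Python with a running sum. This is an illustrative sketch rather than part of the original module; the variable names are hypothetical, and `itertools.accumulate` from the standard library is used only as a convenient way to form the running totals.

```python
from itertools import accumulate

# Upper boundaries and frequencies from Table 2.8. The leading 0 is the
# empty class 24 - 33 that is added before the first real class.
upper_boundaries = [33.5, 43.5, 53.5, 63.5, 73.5, 83.5, 93.5, 103.5]
frequencies = [0, 2, 5, 12, 18, 10, 2, 1]

# Running total: each entry counts the weeks at or below that boundary.
cumulative = list(accumulate(frequencies))
cumulative_pct = [100 * c / cumulative[-1] for c in cumulative]

for boundary, c, p in zip(upper_boundaries, cumulative, cumulative_pct):
    print(f"{boundary:>6}  {c:>3}  {p:>4.0f}%")
```

The last cumulative frequency always equals the total number of observations, which is a quick check that no class has been missed.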
  12. ACTIVITY 2.2
      1. The following questions are based on the given frequency table:

         Marks                    10 - 19   20 - 29   30 - 39   40 - 49   50 - 59
         Number of students (f)        10        25        35        20        10

         (a) Give the number of students who obtained not more than 29 marks.
         (b) Give the number of students who obtained 30 or more marks.

      2. Refer to the frequency table given in Question 1.
         (a) Obtain the class mid-points of all classes.
         (b) Obtain the table of relative frequencies.
         (c) Obtain the "less than or equal" cumulative frequency.

      3. There are 1,000 students staying on a university campus. All of them are respondents in a survey on the degree of comfort of their residential area. The following Likert scale is given to them to gauge their perception:

         1 = Very comfortable
         2 = Comfortable
         3 = Fairly comfortable
         4 = Uncomfortable
         5 = Very uncomfortable

         The research findings show that 120 students chose category '1', 180 students chose category '2', 360 students chose category '3', 240 students chose category '4' and 100 students chose category '5'. Display the research findings in the form of a frequency distribution table, together with the relative frequency distribution in terms of proportions and percentages.

      4. A teacher wants to know the effectiveness of a new teaching method for mathematics at a primary school. The method has been delivered to a class of 20 pupils. A test is given to the pupils at the end of the semester. The test marks are given below:

         77  91  62  54  72  66  84  38  76  70
         84  59  82  78  74  96  44  76  85  66

         Develop a frequency distribution table. Let 35 marks be the lower limit of the first class.
  13. The frequency distribution table, the relative frequency distribution and the cumulative distribution are tabular presentations of the original raw data in a form that is easier to interpret. Tabular presentation is also very useful when a graphical presentation is needed later on. Congratulations! You have reached the end of Topic 2. Do you understand all the material covered in this topic? You may take a short break before you start your revision.