Chapter 3
Frequency Distributions and Data Analyses

Chapter Outline
3.1 Introduction
3.2 Tally Table for Constructing a Frequency Table
3.3 Three Other Frequency Tables
3.4 Graphical Presentation of Frequency Distribution
3.5 Further Economic and Business Applications
3.6 Summary
Questions and Problems

Key Terms
Grouped data
Raw (nongrouped) data
Frequency
Frequency table
Frequency distribution
Cumulative frequencies
Relative frequency
Cumulative relative frequency
Histograms
Stem-and-leaf displays
Frequency polygon
Cumulative frequency polygon
Pie chart
Lorenz curve
Gini coefficient
Absolute inequality

3.1 Introduction
Using the tabular and graphical methods discussed in Chap. 2, we will now develop two general ways to describe data more fully. We discuss first the tally table approach to depicting data frequency distributions and then three other kinds of frequency tables. Next, we explore alternative graphical methods for describing frequency distributions. Finally, we study further applications for frequency distributions in business and economics.

C.-F. Lee et al., Statistics for Business and Financial Economics, DOI 10.1007/978-1-4614-5897-5_3, © Springer Science+Business Media New York 2013
3.2 Tally Table for Constructing a Frequency Table
Before conducting any statistical analysis, we must organize our data sets. One way to organize data is by using a tally table as a worksheet for setting up a frequency table. To set up a tally table for a set of data, we split the data into equal-sized classes in such a way that each observation fits into one and only one class of numbers (i.e., the classes are mutually exclusive). Sometimes data are reported in a frequency table with class intervals given but with actual values of observations in the classes unknown; data presented in this manner are called grouped data. The analyst assigns each data point to a class and enters a tally mark beside that class. Let's see how this works.

Example 3.1 Tallying Scores from a Statistics Exam. Suppose a statistics professor wants to summarize how 20 students performed on an exam. Their scores are as follows: 78, 56, 91, 59, 78, 84, 65, 97, 84, 71, 84, 44, 69, 90, 73, 77, 80, 90, 68, and 75. Data in this form are called nongrouped data or raw data. We can use a tally table like Table 3.1 to list the number of occurrences, or frequency, of each score. A corresponding diagram is shown in Fig. 3.1.

Table 3.1 Student exam scores
Score   Tallies   Frequency
44      /         1
56      /         1
59      /         1
65      /         1
68      /         1
69      /         1
71      /         1
73      /         1
75      /         1
77      /         1
78      //        2
80      /         1
84      ///       3
90      //        2
91      /         1
97      /         1
Total             20

This table presents nongrouped data, and no pattern emerges from them. As an alternative, the data can be grouped into classes by letter grade. If the professor uses a straight grading scale, the classes might be 90–99, 80–89, 70–79, 60–69, and below 60. After establishing the classes, the professor counts the scores in each class and records these numbers to obtain a tally sheet, as shown in Table 3.2 and Fig. 3.2. Note that each observation is included in one and only one class. The tallies are counted, and a frequency table is constructed as shown in Table 3.3, where letter grades are assigned to each class.
Fig. 3.1 Bar graph for nongrouped student exam scores given in Table 3.1 (x-axis: score; y-axis: frequency)

Table 3.2 Tally table for statistics exam scores
Class      Tally     Frequency
Below 60   ///       3
60–69      ///       3
70–79      //////    6
80–89      ////      4
90–99      ////      4
Total                20

Fig. 3.2 Bar graph for grouped student exam scores given in Table 3.2 (x-axis: score class; y-axis: frequency)
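The tallying step of Example 3.1 can also be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `classify` and `tally_table` are hypothetical, and the code assumes scores never exceed 99, as in the example.

```python
from collections import Counter

# Exam scores from Example 3.1 (raw, nongrouped data).
scores = [78, 56, 91, 59, 78, 84, 65, 97, 84, 71,
          84, 44, 69, 90, 73, 77, 80, 90, 68, 75]

# Class labels in the order used by Table 3.2.
CLASS_LABELS = ["Below 60", "60-69", "70-79", "80-89", "90-99"]


def classify(score):
    """Return the label of the single class that contains `score`."""
    if score < 60:
        return "Below 60"
    lower = (score // 10) * 10  # e.g. 78 -> 70
    return f"{lower}-{lower + 9}"


def tally_table(values):
    """Count how many values fall into each class (hypothetical helper)."""
    counts = Counter(classify(v) for v in values)
    return {label: counts.get(label, 0) for label in CLASS_LABELS}


table = tally_table(scores)
for label, freq in table.items():
    # One slash per observation, followed by the class frequency.
    print(f"{label:<9} {'/' * freq:<7} {freq}")
print(f"{'Total':<9} {'':<7} {sum(table.values())}")
```

Because `classify` returns exactly one label per score, the classes are mutually exclusive by construction, which is the property the text requires of a tally table.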
Table 3.3 Frequency table for statistics exam scores
Class      Grade   Frequency
Below 60   F       3
60–69      D       3
70–79      C       6
80–89      B       4
90–99      A       4
Total              20

Example 3.2 A Frequency Distribution of Grade Point Averages. Suppose that there are 30 students in a classroom and that they have the grade point averages listed in Table 3.4. A tally table is constructed, in which classes are (arbitrarily) defined at every half-point and each tally marked next to a particular class accounts for one data entry. The entries are then counted to obtain a frequency distribution, as shown in Table 3.5. A frequency distribution simply shows how many observations fall into each class. We will discuss this concept in further detail in the next section.

Table 3.4 Student GPAs: raw data
1.2   3.9   1.9
3.8   2.4   2.7
2.3   2.3   2.6
0.7   3.1   3.7
3.6   2.9   4.0
2.2   2.7   1.2
1.9   0.8   1.8
2.1   0.3   2.4
3.1   3.2   3.2
0.8   3.1   3.6

Table 3.5 Student GPAs: tally table and frequency distribution
Range       Tallies   Frequency
Below 1.5   //////    6
1.5–1.9     ///       3
2.0–2.4     //////    6
2.5–2.9     ////      4
3.0–3.4     /////     5
3.5–4.0     //////    6
Total                 30

Generally, a data set should be divided into 5–15 classes. Having too few or too many classes gives too little information. Imagine a frequency distribution with only two classes: 0.0–2.0 and 2.1–4.0. With such broadly defined classes, it is difficult to distinguish among GPAs. Similarly, if the class interval were only one-tenth of a point, the large number of classes, each with only one or a few tallies, would make summarizing the data almost impossible.
In the GPA example, it was relatively easy to construct the classes because GPA cutoffs were used. However, in most examples, there are no natural dividing lines between classes. The following guidelines can be used to construct classes:
1. Construct from 5 to 15 classes. This step is the most difficult, because using too many classes defeats the purpose of grouping the data into classes, whereas having too few classes limits the amount of information obtained from the data. As a general rule, when the range and number of observations are large, more classes can be defined. Fewer classes should be constructed when the number of observations is only around 20 or 30.
2. Make sure each observation falls into only one class. This can often be accomplished by defining class boundaries in terms of several decimal places. If the percentage return on stocks is carried to one decimal place, for example, then defining the classes by using two decimal places will ensure that each observation falls into only one class.
3. Try to construct classes with equal class intervals. This may not be possible, however, if there are outlying observations in the data set.

Example 3.3 A Frequency Distribution of 3-Month Treasury Bill Rates. Table 3.6 presents another example; here the data are the interest rates on 3-month treasury bills (T-bills) from 1990 to 2009. (T-bills are debt instruments sold by the US government to finance its budgetary needs.) The annual data for interest rates (average daily rates for a year) are taken from Economic Report of the President, January 2009.

Table 3.6 T-bill interest rates, 1990–2009
Class (%)          Tallies      Frequency
0–1.49             ////         4
1.50–3.49          /////        5
3.50–5.49          /////////    9
5.50–6.49          /            1
6.50 and greater   /            1
Total                           20

As we have noted, a frequency distribution gives the total number of occurrences in each class. In the next chapter, we will talk about using a frequency distribution to present data.

By setting up a tally table and a frequency table, we can scrutinize data for errors. For example, if the data value 123 appears in a column for the rate in the T-bill example, a mistake has clearly been made, one that could be due to a missing decimal point. The data point was probably meant to be 12.3 %, which is much closer to the range of the other data points. Data should also be checked for accuracy. Otherwise, invalid conclusions could be reached.
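As a rough sketch of how Table 3.6 and the error screen described above might be reproduced, the following Python code counts the T-bill rates of Table 3.15 into the classes of Table 3.6 and flags values outside a plausible range. The helper names and the 0–25 % plausibility bounds are assumptions made for illustration; they are not part of the text.

```python
# 3-month T-bill rates (%) for 1990-2009, as listed in Table 3.15.
TBILL_RATES = [7.49, 5.38, 3.43, 3.00, 4.25, 5.49, 5.01, 5.06, 4.78, 4.64,
               5.82, 3.39, 1.60, 1.01, 1.37, 3.15, 4.73, 4.35, 1.37, 0.15]

# Classes of Table 3.6 as (label, lower limit, upper limit). The limits are
# stated to two decimals, so every two-decimal rate fits exactly one class.
CLASSES = [
    ("0-1.49", 0.00, 1.49),
    ("1.50-3.49", 1.50, 3.49),
    ("3.50-5.49", 3.50, 5.49),
    ("5.50-6.49", 5.50, 6.49),
    ("6.50 and greater", 6.50, float("inf")),
]


def frequency_table(values, classes):
    """Count the values that fall inside each class (hypothetical helper)."""
    table = {label: 0 for label, _, _ in classes}
    for value in values:
        for label, lower, upper in classes:
            if lower <= value <= upper:
                table[label] += 1
                break
    return table


def flag_suspects(values, low, high):
    """Return values outside an assumed plausible range [low, high]."""
    return [v for v in values if not (low <= v <= high)]


freq = frequency_table(TBILL_RATES, CLASSES)
# Append the mistyped value 123 from the text to show the screen at work;
# the 0-25 % bounds are an illustrative assumption.
suspects = flag_suspects(TBILL_RATES + [123], low=0.0, high=25.0)
print(freq)
print("Values to double-check:", suspects)
```

A range screen of this kind only catches gross errors such as a missing decimal point; values that are wrong but plausible still need to be checked against the source.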
3.3 Three Other Frequency Tables
In this section, using the frequency table discussed in Sect. 3.2, we move ahead to cumulative frequency tables, relative frequency tables, and relative cumulative frequency tables.

Example 3.4 Frequency Distributions for Statistics Exam Scores. Suppose that for the data listed in Table 3.3, the professor wants to know how many students received a C or below, the proportion of students who received a B, and the proportion of students who received a D or an F. To obtain this information, she calculates cumulative, relative, and cumulative relative frequencies.

By constructing cumulative frequencies, the professor determines the number of students who scored in a particular class or in one of the classes before it (Table 3.7). Obviously, the cumulative frequency for the first class is the frequency itself (3): there are no classes before it. The cumulative frequency for the second class is calculated by taking the frequency in the first class (3) and adding it to the frequency in the second class (3) to arrive at a cumulative frequency of 6. This means that 6 students were in the first two classes. Then 6 is added to the frequency of the third class (6) to derive a cumulative frequency of 12. Thus, 12 students scored a C or a worse grade. The remaining cumulative frequencies are calculated in a similar manner. Note that the cumulative frequency of the last class equals the total number of sample observations, because all observations fall in that class or in previous classes.

Table 3.7 Cumulative frequency table for grade distribution
Class      Grade   Frequency   Cumulative frequency
Below 60   F       3           3
60–69      D       3           6
70–79      C       6           12
80–89      B       4           16
90–99      A       4           20

Another important concept is the relative frequency, which measures the proportion of observations in a particular class. It is calculated by dividing the frequency in that class by the total number of observations. For the data summarized in Table 3.7, the relative frequency for both the first and second classes is 0.15, and the relative frequencies for the remaining three classes are 0.30, 0.20, and 0.20, respectively, as shown in Table 3.8. The sum of the relative frequencies always equals 1.

This table indicates that 15 % of the class received an F, 15 % a D, 30 % a C, and so on. The professor can calculate the cumulative relative frequency for any class by adding the appropriate relative frequencies. Cumulative relative frequency measures the percentage of observations in a particular class and all previous classes. Thus, if she wants to determine what percentage of the students scored below a B, our conscientious professor can add the relative frequencies associated with grades C, D, and F to arrive at 60 %.
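The three kinds of frequency described in this section follow mechanically from the class frequencies. The short Python sketch below, with illustrative variable names, derives them from the frequencies of Table 3.3.

```python
from itertools import accumulate

# Class frequencies from Table 3.3, ordered from the lowest class (F) up.
grades = ["F", "D", "C", "B", "A"]
frequencies = [3, 3, 6, 4, 4]

total = sum(frequencies)

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Relative frequency: class frequency divided by the number of observations.
relative = [f / total for f in frequencies]

# Cumulative relative frequency, derived from the integer running totals so
# that floating-point error does not build up across classes.
cumulative_relative = [c / total for c in cumulative]

print("Grade  Freq  Cum  Rel   CumRel")
for row in zip(grades, frequencies, cumulative, relative, cumulative_relative):
    print("{:<6} {:<5} {:<4} {:<5.2f} {:.2f}".format(*row))
```

The third entry of `cumulative_relative` is the share of students who scored below a B, and the last entry is always 1 because the last class includes every observation.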
Table 3.8 Relative frequency table for grade distribution
Class      Grade   Relative frequency   Cumulative relative frequency
Below 60   F       0.15                 0.15
60–69      D       0.15                 0.30
70–79      C       0.30                 0.60
80–89      B       0.20                 0.80
90–99      A       0.20                 1.00

Example 3.5 Frequency Distributions of Current Ratios for JNJ and MRK. The current ratios for JNJ and MRK from 1990 to 2009 are shown in Table 3.9. A frequency distribution for the current ratios of Johnson and Johnson and Merck is shown in Table 3.10. The current ratio is a measure of liquidity, which (as we noted in Chap. 2) indicates how quickly a firm can obtain cash for operations. The first column defines the classes. Note that the use here of class boundaries ensures that each observation will fall into only one class.

The next column shows the frequency, that is, the number of times that an observation appears in each class. In Table 3.10, we see that JNJ experienced one current ratio between 1.00 and 1.20, one between 1.21 and 1.40, and so on. The third column presents the cumulative frequency. Because there are 20 observations in the population, the cumulative frequency for the last class is 20.

The fourth column presents the relative frequency, which measures the percentage of observations in each class. Relative frequencies can be thought of as probabilities. For example, the probability that an observation is in the first class is 0.05.

Table 3.9 Current ratio for JNJ and MRK
Year   JNJ     MRK
1990   1.778   1.332
1991   1.835   1.532
1992   1.582   1.216
1993   1.624   0.973
1994   1.566   1.270
1995   1.809   1.515
1996   1.807   1.600
1997   1.999   1.475
1998   1.364   1.685
1999   1.771   1.285
2000   2.164   1.375
2001   2.296   1.123
2002   1.683   1.199
2003   1.710   1.205
2004   1.962   1.147
2005   2.485   1.582
2006   1.199   1.197
2007   1.510   1.227
2008   1.649   1.348
2009   1.820   1.805
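The columns of such a frequency distribution can be computed directly from the raw ratios. The sketch below does this for Merck's current ratios from Table 3.9. It assumes that each class is closed on the right, so that a value such as 1.205 is placed in the class whose printed limits are 1.21–1.40; this convention is an assumption chosen because it reproduces the class counts reported for MRK, and the helper name `class_index` is hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate

# Merck (MRK) current ratios, 1990-2009, from Table 3.9.
MRK = [1.332, 1.532, 1.216, 0.973, 1.270, 1.515, 1.600, 1.475, 1.685, 1.285,
       1.375, 1.123, 1.199, 1.205, 1.147, 1.582, 1.197, 1.227, 1.348, 1.805]

# Class edges: class i is assumed to cover (EDGES[i], EDGES[i + 1]].
EDGES = [0.80, 1.00, 1.20, 1.40, 1.60, 1.80, 2.00]


def class_index(x, edges):
    """Return the index of the right-closed class that contains x."""
    # bisect_left gives the position of the first edge that is >= x.
    i = bisect_left(edges, x) - 1
    if not 0 <= i < len(edges) - 1:
        raise ValueError(f"{x} lies outside the class range")
    return i


frequency = [0] * (len(EDGES) - 1)
for ratio in MRK:
    frequency[class_index(ratio, EDGES)] += 1

n = len(MRK)
cumulative = list(accumulate(frequency))
relative = [f / n for f in frequency]
cumulative_relative = [c / n for c in cumulative]

for i, f in enumerate(frequency):
    print(f"({EDGES[i]:.2f}, {EDGES[i + 1]:.2f}]  {f:>2}  {cumulative[i]:>2}  "
          f"{relative[i]:.2f}  {cumulative_relative[i]:.2f}")
```

Stating the edges explicitly and fixing which side of each class is closed is one way to guarantee that every observation falls into exactly one class, even when the data carry more decimal places than the printed class limits.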
Table 3.10 Frequency distributions of current ratios for JNJ and MRK
Class       Frequency   Cumulative frequency   Relative frequency   Cumulative relative frequency
JNJ
1.00–1.20   1           1                      0.05                 0.05
1.21–1.40   1           2                      0.05                 0.10
1.41–1.60   3           5                      0.15                 0.25
1.61–1.80   6           11                     0.30                 0.55
1.81–2.00   6           17                     0.30                 0.85
2.01–2.50   3           20                     0.15                 1.00
Total       20                                 1.00
MRK
0.81–1.00   1           1                      0.05                 0.05
1.01–1.20   4           5                      0.20                 0.25
1.21–1.40   8           13                     0.40                 0.65
1.41–1.60   5           18                     0.25                 0.90
1.61–1.80   1           19                     0.05                 0.95
1.81–2.00   1           20                     0.05                 1.00
Total       20                                 1.00

The last column indicates the cumulative relative frequency, which measures the percentage of observations in a particular class and all previous classes. The cumulative relative frequency for Merck's fourth class is calculated by adding the relative frequencies of the first four classes to arrive at 0.90. That is, 90 % of the observations occur in the first four classes. The cumulative relative frequency of the last class always equals 1, because the last class includes all the observations.

3.4 Graphical Presentation of Frequency Distribution
We have spoken before of the special effectiveness of using graphs to present data. In this section, we discuss four different graphical approaches to presenting frequency distributions.

3.4.1 Histograms
Frequency distributions can be represented on a variety of graphs. The histogram, which is one of the most commonly used types, is similar to the bar charts discussed in Chap. 2 except that
1. Neighboring bars touch each other.
2. The area inside any bar (its height times its width) is proportional to the number of observations in the corresponding class.

To illustrate these two points, suppose the age distribution of personnel at a small business is as shown in Table 3.11.
Table 3.11 Age distribution of personnel
Class   Frequency
20–29   3
30–39   6
40–49   7
50–59   4
60–69   1
70–79   1

To construct a histogram, we need to enter a scale on the horizontal axis. Because the data are discrete, there is a gap between the class intervals, say between 20–29 and 30–39. In such a case, we will use the midpoint between the end of one class and the beginning of the next as our dividing point. Between the 20–29 interval and the 30–39 interval, the dividing point will be (29 + 30)/2 = 29.5. We find the dividing points between the remaining classes similarly.

To satisfy the second condition, we note that all six classes have an interval width of 10 years. Figure 3.3 is the histogram that reflects these data.

Drawn from the data of Table 3.10, Fig. 3.4a, b are histograms of JNJ's and MRK's current ratios for the years 1990–2009. The x-axis indicates the classes and the y-axis the frequencies. As the histograms show, MRK's current ratios have tended to fall in the 1.0–1.4 range, whereas those of JNJ show no exact pattern, but many can be found in the 1.61–2.00 range. (In Chap. 4, we will cover measures of skewness, which give us more insight into the shape of a distribution.)

Fig. 3.3 Histogram of age distribution given in Table 3.11 (x-axis: age, with dividing points 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, and 79.5; y-axis: frequency)
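The dividing points between discrete classes can be computed as in the following minimal sketch. The function name `class_boundaries` is hypothetical, and the outermost boundaries are obtained by extending half of the gap beyond the first and last class limits, which is an assumption that matches the axis of Fig. 3.3.

```python
# Class limits from Table 3.11 (ages in whole years).
CLASS_LIMITS = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79)]


def class_boundaries(limits):
    """Return the dividing points of a histogram built on discrete classes."""
    # Gap between the end of one class and the start of the next (1 year here).
    gap = limits[1][0] - limits[0][1]
    first = limits[0][0] - gap / 2
    # Midpoint between each upper limit and the next class's lower limit.
    inner = [(upper + next_lower) / 2
             for (_, upper), (next_lower, _) in zip(limits, limits[1:])]
    last = limits[-1][1] + gap / 2
    return [first, *inner, last]


boundaries = class_boundaries(CLASS_LIMITS)
widths = [b - a for a, b in zip(boundaries, boundaries[1:])]
print(boundaries)
print(widths)
```

With equal widths, bar height alone reflects the class frequency; if the widths differed, the heights would have to be rescaled so that bar areas stay proportional to the frequencies.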
Fig. 3.4 (a) Frequency histogram of JNJ's current ratios as given in Table 3.10 (b) Frequency histogram of MRK's current ratios as given in Table 3.10

Most standard statistical software packages will construct a histogram from these data. Using MINITAB, we can specify the class width and the starting class midpoint, or we can let MINITAB select these values. The output will contain the frequency distribution as well as a graphical representation in the form of a histogram (without the bars). MINITAB will provide each class frequency next to the corresponding class midpoint (not class limits). Figure 3.5a contains the necessary MINITAB commands and the resulting output for the current ratio of MRK where the class width (CW) and the midpoint of the first class were not specified. Figure 3.5b specified CW as 0.2000 and the first midpoint as 0.905. We can use the output as it appears or use this information to construct Fig. 3.4b, which is a graphical representation of MRK's current ratios as given in Table 3.10.
Data Display
MRK
1.332 1.532 1.216 0.973 1.27 1.515 1.6 1.475 1.685
1.285 1.375 1.123 1.199 1.205 1.147 1.582 1.197 1.227
1.348 1.805

(a)
Histogram of MRK
* NOTE * The character graph commands are obsolete.
Histogram
Histogram of MRK N = 20
Midpoint   Count
1.0        1   *
1.1        2   **
1.2        5   *****
1.3        4   ****
1.4        1   *
1.5        3   ***
1.6        2   **
1.7        1   *

(b)
Histogram
Histogram of MRK N = 20
Midpoint   Count
0.905      1   *
1.105      4   ****
1.305      8   ********
1.505      5   *****
1.705      1   *
1.905      1   *

Fig. 3.5 (a) Histogram using MINITAB, where the class width and the midpoint of the first class are not specified (b) Histogram using MINITAB using specified classes, where the class width is 0.2000 and the first midpoint is 0.905

Histograms can also be used to chart the companies' relative and cumulative frequencies, as shown in Figs. 3.6 and 3.7. Note the similarity between the frequency and relative frequency histograms (Figs. 3.4 and 3.6) and between the cumulative frequency and the relative cumulative frequency graphs (Figs. 3.7 and 3.8); the only difference between them is the variable on the y-axis. Note also that geometrically, the relative frequency of each class in a frequency histogram equals its area divided
by the total area of all the classes. For example, the area for the first class for Merck's current ratio (Fig. 3.4b) is equal to the base of the bar times its height (0.19 × 1 = 0.19), and the sum of all the areas is 3.8. The relative frequency for the first class is thus 0.19/3.8 = 0.05.

Fig. 3.6 (a) Relative frequency histogram of JNJ's current ratios (b) Relative frequency histogram of MRK's current ratios

3.4.2 Stem-and-Leaf Display
An alternative to histograms for the presentation of either nongrouped or grouped data is the stem-and-leaf display. Stem-and-leaf displays were originally developed by John Tukey of Princeton University. They are extremely useful in summarizing data sets of reasonable size (under 100 values as a general rule), and unlike histograms, they result in no loss of information. By this, we mean that it is possible
to reconstruct the original data set from a stem-and-leaf display, which we cannot do when using a histogram.

Fig. 3.7 (a) Cumulative frequency histogram of JNJ's current ratios (b) Cumulative frequency histogram of MRK's current ratios

For example, suppose a financial analyst is interested in the amount of money spent by food product companies on advertising. He or she samples 40 of these food product firms and calculates the amount that each spent last year on advertising as a percentage of its total revenue. The results are listed in Table 3.12.

Let's use this set of data to construct a stem-and-leaf display. In Fig. 3.9, each observation is represented by a stem to the left of the vertical line and a leaf to the right of the vertical line. For example, the stems and leaves for the first three observations in Table 3.12 can be defined as
Stem   Leaf
12     0.5
8      0.8
11     0.5

In other words, stems are the integer portions of the observations, whereas leaves represent the decimal portions.

Fig. 3.8 (a) Cumulative relative frequency histogram of JNJ's current ratios (b) Cumulative relative frequency histogram of MRK's current ratios

The procedure used to construct a stem-and-leaf display is as follows:
1. Decide how the stems and leaves will be defined.
2. List the stems in a column in ascending order.
3. Proceed through the data set, placing the leaf for each observation in the appropriate stem row. (You may want to place the leaves of each stem in increasing order.)

Table 3.12 Percentage of total revenue spent on advertising
Company   Percentage   Company   Percentage
1         12.5         21        6.4
2         8.8          22        7.8
3         11.5         23        8.5
4         9.1          24        9.5
5         9.4          25        11.3
6         10.1         26        8.9
7         5.3          27        6.6
8         10.3         28        7.5
9         10.2         29        8.3
10        7.4          30        13.8
11        8.2          31        12.9
12        7.8          32        11.8
13        6.5          33        10.4
14        9.8          34        7.6
15        9.2          35        8.6
16        12.8         36        9.4
17        13.9         37        7.3
18        13.7         38        9.5
19        9.6          39        8.3
20        6.8          40        7.1

Stems   Leaves            Frequency
5       3                 1
6       4 5 6 8           4
7       1 3 4 5 6 8 8     7
8       2 3 3 5 6 8 9     7
9       1 2 4 4 5 5 6 8   8
10      1 2 3 4           4
11      3 5 8             3
12      5 8 9             3
13      7 8 9             3
Total                     40
Fig. 3.9 Stem-and-leaf display for advertising expenditure

The percentage of revenues spent on advertising by the 40 food product firms listed in Table 3.12 is represented by a stem-and-leaf diagram in Fig. 3.9. From this diagram, we observe that the minimum percentage of advertising spending is 5.3 % of total revenue, the maximum percentage of advertising spending is 13.9 %, and the largest group of firms spends between 9.1 % and 9.8 % of total revenue on advertising. Also, the 7 leaves in stem row 7 indicate that 7 firms' advertising spending is at least 7 % but less than 8 %. The 3 leaves in stem row 13 tell us at a
glance that 3 firms spend more than 13 % of total revenue on advertising. A MINITAB version of the stem-and-leaf diagram generated by these data is shown in Fig. 3.10. A stem-and-leaf diagram is presented in the last portion of Fig. 3.10. In the first column of this diagram, (8) represents the number of observations in the middle group with a stem of 9; 1, 5, 12, and 19 represent the cumulative frequencies from the first group up to the fourth group; and 3, 6, 9, and 13 represent the cumulative frequencies from the ninth group up to the sixth group.

3.4.3 Frequency Polygon
A frequency polygon is obtained by linking the midpoints indicated on the x-axis of the class intervals from a frequency histogram. A cumulative frequency polygon is derived by connecting the midpoints indicated on the x-axis of the class intervals from a cumulative frequency histogram. Figures 3.11 and 3.12 show the frequency polygon and the cumulative frequency polygon, respectively, for JNJ's current ratio. Although a histogram does demonstrate the shape of the data, perhaps the shape can be more clearly illustrated by using a frequency polygon.

Data Display
ADV EXP
12.5  8.8   11.5  9.1   9.4   10.1  5.3   10.3
10.2  7.4   8.2   7.8   6.5   9.8   9.2   12.8
13.9  13.7  9.9   6.8   6.4   7.8   8.5   9.5
11.3  8.9   6.6   7.5   8.3   13.8  12.9  11.8
10.4  7.6   8.6   9.4   7.3   9.5   8.3   7.1

MTB > STEM AND LEAF USING 'ADV EXP'
Character Stem-and-Leaf Display
Stem-and-leaf of ADV EXP N = 40
Leaf Unit = 0.10
  1    5  3
  5    6  4568
 12    7  1345688
 19    8  2335689
 (8)   9  12445589
 13   10  1234
  9   11  358
  6   12  589
  3   13  789

Fig. 3.10 Stem-and-leaf diagram for advertising expenditure using MINITAB
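The three-step stem-and-leaf procedure of Sect. 3.4.2 can be sketched in Python as follows, using the values of Table 3.12. The function name `stem_and_leaf` is hypothetical, and the sketch assumes that every observation carries exactly one decimal place, so that the integer part serves as the stem and the tenths digit as the leaf.

```python
from collections import defaultdict

# Advertising spending as a percentage of total revenue (Table 3.12),
# listed by company number 1 through 40.
ADV_PCT = [12.5, 8.8, 11.5, 9.1, 9.4, 10.1, 5.3, 10.3, 10.2, 7.4,
           8.2, 7.8, 6.5, 9.8, 9.2, 12.8, 13.9, 13.7, 9.6, 6.8,
           6.4, 7.8, 8.5, 9.5, 11.3, 8.9, 6.6, 7.5, 8.3, 13.8,
           12.9, 11.8, 10.4, 7.6, 8.6, 9.4, 7.3, 9.5, 8.3, 7.1]


def stem_and_leaf(values):
    """Group one-decimal values into {stem: sorted list of leaves}."""
    rows = defaultdict(list)
    for value in values:
        # Work in tenths (12.5 -> 125) so floating-point noise cannot
        # change the leaf digit.
        tenths = round(value * 10)
        stem, leaf = divmod(tenths, 10)
        rows[stem].append(leaf)
    # Stems in ascending order, leaves in increasing order within each stem.
    return {stem: sorted(rows[stem]) for stem in sorted(rows)}


display = stem_and_leaf(ADV_PCT)
for stem, leaves in display.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}  ({len(leaves)})")
```

Because each leaf is kept, the original values can be rebuilt from the display (for example, stem 12 with leaf 5 is 12.5), which is the sense in which no information is lost.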
Fig. 3.11 Frequency polygon of JNJ's current ratios
Fig. 3.12 Cumulative frequency polygon of JNJ's current ratios

3.4.4 Pie Chart
Histograms are perhaps the graphical forms most commonly used in statistics, but other pictorial forms, such as the pie chart, are often used to present financial and marketing data. For example, Fig. 3.13 depicts a family's sources of income. This pie chart indicates that 80 % of this family's income comes from salary.

For data already in frequency form, a pie chart is constructed by converting the relative frequencies of each class into their respective arcs of a circle. For example, a pie chart can be used to represent the student grade distribution data originally presented in Table 3.3. In Table 3.13, the arcs (in degrees) for the five slices shown in Fig. 3.14 were obtained by multiplying each relative frequency by 360.
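As a small illustration of the arc calculation, the sketch below converts the grade frequencies of Table 3.3 into pie-chart arcs. The function name `pie_arcs` is hypothetical.

```python
# Grade distribution from Table 3.3.
CLASSES = ["Below 60", "60-69", "70-79", "80-89", "90-99"]
FREQUENCIES = [3, 3, 6, 4, 4]


def pie_arcs(frequencies):
    """Return the arc (in degrees) of each pie slice."""
    total = sum(frequencies)
    # arc = relative frequency * 360, written as f * 360 / total so the
    # results stay exact for these whole-number inputs.
    return [f * 360 / total for f in frequencies]


arcs = pie_arcs(FREQUENCIES)
for label, arc in zip(CLASSES, arcs):
    print(f"{label:<9} {arc:6.1f} degrees")
print(f"{'Total':<9} {sum(arcs):6.1f} degrees")
```

The arcs always sum to 360 degrees because the relative frequencies sum to 1.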
Fig. 3.13 Sources of family income

Table 3.13 Grade distribution for 20 students
Class      Frequency   Relative frequency   Arc (degrees)
Below 60   3           0.15                 54
60–69      3           0.15                 54
70–79      6           0.30                 108
80–89      4           0.20                 72
90–99      4           0.20                 72
Total      20          1.00                 360

Fig. 3.14 Grade distribution pie chart

3.5 Further Economic and Business Applications
3.5.1 Lorenz Curve
Fig. 3.15 (a) and (b) Lorenz curves (x-axis: cumulative percentage of families; y-axis: cumulative percentage of family income; panel a shows line OP, path ONP, curve H, points A, B, and C, and areas I and II; panel b shows curves H and S)

The Lorenz curve, which represents a society's distribution of income, is a cumulative frequency curve used in economics (Fig. 3.15a). The cumulative percentage of families (ranked by income) is measured on the x-axis, and the cumulative
percentage of family income received is measured on the y-axis. For example, suppose there are 100 families, and each earns $100; that is, the distribution of income is perfectly equal. The resulting Lorenz curve will be a 45° line (OP), because the cumulative percentage of families (e.g., 40 %) and the cumulative share of family income received are always equal.

Now suppose that one family receives 100 % of total family income; that is, the income distribution is absolutely unequal. The resulting Lorenz curve (ONP) coincides with the x-axis until point N, where there is a discontinuous jump to point P. This is because, with the exception of that single family (represented by point N), each family receives 0 % of total family income. Therefore, these families' cumulative share of total family income is also 0 %.

The shape the Lorenz curve is most likely to assume is curve H, which lies between absolute inequality and equality. This curve indicates that the lowest-income families, who comprise 40 % of families (point A), receive a disproportionately small share (about 7 %) of total family income (point C). If every family had the same income, the share going to the lowest 40 % would be represented by point B (40 %).

Note that with a more equitable distribution of income, the Lorenz curve is less bowed, or flatter. Curve S in Fig. 3.15b is the Lorenz curve after a progressive income tax is imposed. Because S is flatter than H (which is reproduced from Fig. 3.15a), we can conclude that the distribution of income (after taxes) is more nearly equal than before, as would be expected.

One way to measure the inequality of income from the Lorenz curve is to use the Gini coefficient.

Gini coefficient for curve H = (area I) / (area I + area II)

The Gini coefficient can range from 0 (perfect equality) to 1 (absolute inequality, wherein one family receives all the income).

Examining Fig. 3.15b reveals that the Gini coefficient will be smaller for curve S than it is for curve H. In other words, the progressive income tax makes the distribution of income more nearly equal.

3.5.2 Stock and Market Rate of Return
Table 3.14 presents the frequency tables for the rate of return for Johnson and Johnson, Merck, and the stock market overall. (The data are drawn from Table 2.4 in Appendix 2 of Chap. 2.) Because the two firms have similar frequency distributions, we can conclude that the performances of Johnson and Johnson's and Merck's stocks have been similar. However, Johnson and Johnson's highest class is
0.001–0.200, while Merck's highest classes are spread but found at −0.200 and below and at 0.201–0.400.

The stock market's overall lowest class was found at −0.200 and below, but its highest class was only 0.001–0.200. Thus, the overall market has fluctuated less than the return of the two pharmaceutical firms. And although Johnson and Johnson and Merck have a higher top class, the market suffered through fewer negative returns. Moreover, Johnson and Johnson and Merck had 9 and 8 years, respectively, of losses, while the market had only five. In other words, the pharmaceutical firms offered the potential of higher returns but also threatened the investor with a greater risk of loss.

Table 3.14 Rates of return for JNJ and MRK stock and the S&P 500
Class              Frequency (years)   Cumulative frequency   Relative frequency   Cumulative relative frequency
JNJ
−0.200 and below   4                   4                      0.1905               0.1905
−0.199 to 0.000    5                   9                      0.2381               0.4286
0.001–0.200        5                   14                     0.2381               0.6667
0.201–0.400        5                   19                     0.2381               0.9048
0.401–0.600        1                   20                     0.0476               0.9524
0.601–1.00         1                   21                     0.0476               1.0000
Total              21                                         1.000
MRK
−0.200 and below   5                   5                      0.2381               0.2381
−0.199 to 0.000    3                   8                      0.1429               0.3810
0.001–0.200        3                   11                     0.1429               0.5238
0.201–0.400        5                   16                     0.2381               0.7619
0.401–0.600        3                   19                     0.1429               0.9048
0.601–1.00         2                   21                     0.0952               1.0000
Total              21                                         1.000
S&P 500 (market)
−0.200 and below   1                   1                      0.0476               0.0476
−0.199 to 0.000    4                   5                      0.1905               0.2381
0.001–0.200        11                  16                     0.5238               0.7619
0.201–0.400        5                   21                     0.2381               1.0000
Total              21                                         1.000

3.5.3 Interest Rates
Histograms can be used to summarize movements in such interest rates as the prime rate and the treasury bill rate. The prime rate is the interest rate that banks charge to their best customers; treasury bills are short-term debt instruments issued by the US
government. Let us examine how these rates have moved over the period 1990–2009, as shown in Table 3.15.

Table 3.15 3-Month T-bill rate and prime rate (1990–2009)
Year   3-Month T-bill rate   Prime rate
1990   7.49                  10.01
1991   5.38                  8.46
1992   3.43                  6.25
1993   3.00                  6.00
1994   4.25                  7.14
1995   5.49                  8.83
1996   5.01                  8.27
1997   5.06                  8.44
1998   4.78                  8.35
1999   4.64                  7.99
2000   5.82                  9.23
2001   3.39                  6.92
2002   1.60                  4.68
2003   1.01                  4.12
2004   1.37                  4.34
2005   3.15                  6.19
2006   4.73                  7.96
2007   4.35                  8.05
2008   1.37                  5.09
2009   0.15                  3.25

As can be seen in Table 3.16 and Fig. 3.16, the prime rate is skewed to the right, with 65 % of the observations appearing in the ranges made up of the slightly higher midrange interest rates (6–6.9 %, 7–7.9 %, and 8–8.9 %). If you were to predict a future value for the prime rate, your best guess would be in the 6–9 % range. This wide range would probably not be of much use. Better methods for prediction, such as multiple regression and time series analysis, will be discussed later (Chaps. 15 and 18).

Table 3.16 Frequency distributions of interest rates
            Prime rate                       T-bill
Class (%)   Frequency   Relative frequency   Frequency   Relative frequency
0–1.99      0           0.00                 5           0.25
2–2.99      0           0.00                 0           0.00
3–3.99      1           0.05                 4           0.20
4–4.99      3           0.15                 5           0.25
5–5.99      1           0.05                 5           0.25
6–6.99      4           0.20                 0           0.00
7–7.99      3           0.15                 1           0.05
8–8.99      6           0.30                 0           0.00
9–9.99      1           0.05                 0           0.00
10–10.99    1           0.05                 0           0.00
Total       20          1.00                 20          1.00
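The frequency distributions of the two interest rates can be rebuilt from the yearly values in Table 3.15 with a sketch like the one below. The helper names are hypothetical, and the class labels follow Table 3.16, with every rate below 2 % placed in the 0–1.99 class.

```python
from collections import Counter

# Yearly rates (%) for 1990-2009, from Table 3.15.
TBILL = [7.49, 5.38, 3.43, 3.00, 4.25, 5.49, 5.01, 5.06, 4.78, 4.64,
         5.82, 3.39, 1.60, 1.01, 1.37, 3.15, 4.73, 4.35, 1.37, 0.15]
PRIME = [10.01, 8.46, 6.25, 6.00, 7.14, 8.83, 8.27, 8.44, 8.35, 7.99,
         9.23, 6.92, 4.68, 4.12, 4.34, 6.19, 7.96, 8.05, 5.09, 3.25]


def class_label(rate):
    """Return the whole-percent class that contains `rate`."""
    lower = int(rate)  # truncate to the whole percent, e.g. 8.46 -> 8
    if lower < 2:
        return "0-1.99"  # the first class of Table 3.16 spans two points
    return f"{lower}-{lower}.99"


def frequency_table(rates):
    """Count the rates in each class (hypothetical helper)."""
    return Counter(class_label(r) for r in rates)


def share_in(counts, labels):
    """Fraction of all observations that fall in the given classes."""
    return sum(counts[label] for label in labels) / sum(counts.values())


prime_counts = frequency_table(PRIME)
tbill_counts = frequency_table(TBILL)
prime_share = share_in(prime_counts, ["6-6.99", "7-7.99", "8-8.99"])
tbill_share = share_in(tbill_counts, ["4-4.99", "5-5.99"])
print(f"Prime rate observations in 6-8.99 %: {prime_share:.0%}")
print(f"T-bill observations in 4-5.99 %: {tbill_share:.0%}")
```

Recomputing a published table from its raw data in this way is also a simple check that each set of frequencies is attached to the correct series.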
Fig. 3.16 Frequency histogram of prime lending rates given in Table 3.15 (x-axis: interest rates, in classes 1–2.9, 3.0–4.9, 5.0–6.9, 7.0–8.9, and 9.0–11; y-axis: frequency)

The frequency table for the treasury bill rate is shown in Table 3.16. This distribution, like that of the prime rate, is skewed to the right. Fifty percent of the observations appear in the fourth and fifth classes, 4–4.9 % and 5–5.9 %. This distribution is depicted in the histogram shown in Fig. 3.17.

If you were to make a prediction of the treasury bill rate, it would probably be in the 3–6 % range. Again, better methods for predicting observations will be discussed later.

Fig. 3.17 Frequency histogram of T-bill rates given in Table 3.15 (x-axis: interest rate, in classes 0–1.9, 2–2.9, 3–3.9, 4–4.9, 5–5.9, 6–6.9, and 7–7.9; y-axis: frequency)
3.5.4 Quality Control
Figure 3.18 depicts the quality control data on electronic parts given in Table 3.17. This control chart shows the percentage of defects for each sample lot. Figure 3.18 indicates that both lots 5 and 7 have exceeded the allowed maximum defect level of 3 %. Therefore, the product quality in these two lots should be improved.

Table 3.17 Quality control report on electronic parts
Sample lot   Sample   Defects   Percentage
1            1,000    15        1.5
2            1,000    20        2.0
3            1,000    17        1.7
4            1,000    25        2.5
5            1,000    35        3.5
6            1,000    20        2.0
7            1,000    36        3.6
8            1,000    28        2.8
Total        8,000    196       2.45 (mean)

Fig. 3.18 Frequency bar graph of the percentage of defects for each sample lot (x-axis: lot number, 1–8; y-axis: percentage defective)
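The quality control check in this example amounts to computing a defect percentage for each lot and comparing it with the allowed maximum. A minimal sketch, with illustrative constant names, might look like this:

```python
# Defects found in each sample lot of 1,000 parts (Table 3.17).
DEFECTS = {1: 15, 2: 20, 3: 17, 4: 25, 5: 35, 6: 20, 7: 36, 8: 28}
SAMPLE_SIZE = 1000
MAX_DEFECT_PCT = 3.0  # allowed maximum defect level stated in the text

# Percentage of defective parts in each lot.
defect_pct = {lot: 100 * d / SAMPLE_SIZE for lot, d in DEFECTS.items()}

# Lots whose defect percentage exceeds the allowed maximum.
flagged = [lot for lot, pct in defect_pct.items() if pct > MAX_DEFECT_PCT]

# Overall defect percentage across all eight lots.
overall_pct = 100 * sum(DEFECTS.values()) / (SAMPLE_SIZE * len(DEFECTS))

for lot, pct in defect_pct.items():
    marker = "  <-- above limit" if lot in flagged else ""
    print(f"Lot {lot}: {pct:.1f} %{marker}")
print(f"Overall: {overall_pct:.2f} %")
```

Note that the overall rate is below the 3 % limit even though two individual lots exceed it, which is why the chart is read lot by lot.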
- 25. 3.6 Summary

In this chapter, we extended the discussion of Chap. 2 by showing how data can be grouped to make analysis easier. After the data are grouped, frequency tables, histograms, stem-and-leaf displays, and other graphical techniques are used to present them in an effective and memorable way.

Our ultimate goal is to use a sample to make inferences about a population. Unfortunately, neither the tabular nor the graphical approach lends itself to measuring the reliability of an inference in data analysis. To do this, we must develop numerical measures for describing data sets. Therefore, in the next chapter, we show how data can be described by the use of descriptive statistics such as the mean, standard deviation, and other summary statistical measures.

Questions and Problems

1. Explain the difference between grouped and nongrouped data.
2. Explain the difference between frequency and relative frequency.
3. Explain the difference between frequency and cumulative frequency.
4. Carefully explain how the concept of cumulative frequency can be used to form the Lorenz curve.
5. Suppose you are interested in constructing a frequency distribution for the heights of 80 students in a class. Describe how you would do this.
6. What is a frequency polygon? Why is a frequency polygon useful in data presentation?
7. Use the prime rate data given in Table 3.6 in the text to construct cumulative frequency and cumulative relative frequency tables.
8. Use the percentage of total revenue spent on advertising listed in Table 3.12 of the text to draw a frequency polygon and a cumulative frequency polygon.
9. On November 17, 1991, the Home News of central New Jersey used the bar chart given here to show that foreign investors are taxed at a lower rate than US citizens.
(a) Construct a table to show frequency, relative frequency, and cumulative frequency.
(b) Draw a frequency polygon and a cumulative frequency polygon.
- 26. [Figure: bar chart titled "Foreign investors get big tax breaks on money made in the U.S." 1988 tax rates, in percent. Rate for middle-income* U.S. taxpayers: 10.7%. An American taxpayer earning $30,000 to $40,000 paid $3,708 . . . If taxed at the same rate granted Kuwaiti investors in the U.S., the American would have paid just $454.
Bar values (percent), in the order printed: 6.06, 5.63, 5.21, 4.64, 4.39, 4.39, 3.95, 3.67, 2.50, 2.24, 1.97, 1.86, 1.29, 1.02, .76, .14
Country labels, in the order printed: Japan, Cayman Islands, Spain, Great Britain, Canada, France, Italy, Netherlands, Saudi Arabia, Belgium, Kuwait, Neth. Antilles, Taiwan, Singapore, Finland, United Arab Emirates
*$30,000 to $40,000
Source: Philadelphia Inquirer, Internal Revenue Service.]
Source: Home News, November 17, 1991. Reprinted by permission of Knight-Ridder Tribune News

10. Use the EPS and DPS data given in Table 2.3 in Chap. 2 to construct frequency distributions.
11. Use the data from question 10 to construct a relative frequency graph and a cumulative relative frequency graph for both EPS and DPS.
12. On November 17, 1991, the Home News of central New Jersey used the bar chart in the accompanying figure to show the 1980–1991 passenger traffic trends for Newark International Airport.
(a) Use these data to draw a line chart and interpret your results.
(b) Use these data to draw a stem-and-leaf diagram and interpret your results.
- 27. [Figure: bar chart of passengers (in millions) by year. Vertical axis: Passengers (in millions), 0 to 30; horizontal axis: Year, labeled 80 through 91.
Bar values, in the order printed: 9.2, 10.2, 12.0, 17.4, 23.6, 28.8, 29.4, 23.4, 22.4, 20.9, 22.3
Source: Port Authority of NY and NJ]
Source: Home News, November 17, 1991. Reprinted by permission of the publisher

13. An advertising executive is interested in the age distribution of the subscribers to Person magazine. The age distribution is as follows:

Age       Number of subscribers
18–25     10,000
26–35     25,000
36–45     28,000
46–55     19,000
56–65     10,000
Over 65   7,000

(a) Use a frequency distribution graph to present these data.
(b) Use a relative frequency distribution to present these data.
14. Use the data from question 13 to produce a cumulative frequency graph and a cumulative relative frequency graph.
- 28. 15. Construct stem-and-leaf displays for the 3-month T-bill rate and the prime rate, using the data listed in Table 3.15.

Use the goaltenders' salaries for the 1991 NHL season given in the following table to answer questions 16–20.

Name                 Team                  Gross salary
Patrick Roy          Montreal Canadiens    $1.056M (a)
Ed Belfour           Chicago Blackhawks    $925,000
Ron Hextall          Philadelphia Flyers   $735,000
Mike Richter         New York Rangers      $700,000
Kelly Hrudey         Los Angeles Kings     $550,000
Mike Liut            Washington Capitals   $525,000
Mike Vernon          Calgary Flames        $500,000
Grant Fuhr           Toronto Maple Leafs   $424,000
John Vanbiesbrouck   New York Rangers      $375,000
Ken Wregget          Philadelphia Flyers   $375,000
Tom Barrasso         Pittsburgh Penguins   $375,000

(a) Roy's salary is $500,000 Canadian, and $700,000 Canadian deferred. The salary listed is US equivalent.

16. Group the data given in the table into the following groups: $351,000–400,000; 401,000–450,000; 451,000–500,000; 501,000–550,000; 551,000–600,000; 601,000–650,000; 651,000–700,000; over 701,000.
17. Use your results from question 16 to construct a cumulative frequency table.
18. Use your results from question 16 to construct a relative frequency table and a cumulative relative frequency table.
19. Use a bar graph to plot the frequency distribution.
20. Use a bar graph to plot the cumulative relative frequency.
21. Briefly explain why the Lorenz curve shown in Fig. 3.15b has the shape it does.
22. The students in an especially demanding history class earned the following grades on the midterm exam: 86, 75, 92, 98, 71, 55, 63, 82, 94, 90, 80, 62, 62, 65, and 68. Use MINITAB to draw a stem-and-leaf graph of these grades.
23. Use the data given in question 22 to construct a tally table for the grades. Use intervals 51–60, 61–70, 71–80, 81–90, and 91–100.
24. Construct a cumulative frequency table for the tally table you constructed in question 23.
25. Use the data in question 24 to graph the cumulative frequency on a bar chart by using Microsoft Excel.
26. Suppose the Gini coefficient in some country were equal to 0. What would that tell us about income in this country?
27. Suppose the Gini coefficient in another country were equal to 1. What would that tell us about income in this country?

Use the following information to answer questions 28–34. Suppose Weight Watchers has collected the following weight loss data, in pounds, for 30 of its clients.

15, 20, 10, 6, 8, 18, 32, 17, 19, 7, 9, 12, 14, 9, 25, 18, 21, 3, 2, 18, 12, 15, 14, 28, 34, 30, 18, 12, 11, 8
- 29. 28. Construct a tally table for weight loss. Use 5-lb intervals beginning with 1–5 lb, 6–10 lb, etc.
29. Construct a cumulative frequency table for weight loss.
30. Construct a frequency histogram for weight loss using MINITAB.
31. Construct a frequency polygon for weight loss.
32. Construct a table for the relative frequencies and the cumulative relative frequencies.
33. Graph the relative frequency.
34. Graph the cumulative relative frequency.
35. The following graph shows the Lorenz curves for two countries, Modestia and Richardonia. Which country has the most nearly equal distribution of income?

[Figure: Lorenz curves for Modestia and Richardonia. Axis labels: Cumulative Percentage of Family Income; Cumulative Percentage of Families.]

Use the following information to answer questions 36–41. Suppose a class of high school seniors had the following distribution of SAT scores in English.

SAT score   Number of students
401–450     8
451–500     10
501–550     15
551–600     6
601–650     4
651–700     1

36. Construct a cumulative frequency table.
37. Use a histogram to graph the cumulative frequencies.
38. Construct a frequency polygon.
39. Compute the relative frequencies and the cumulative relative frequencies.
40. Construct a relative frequency histogram.
41. Construct a cumulative relative frequency histogram.

Use the following prices of Swiss stocks to answer questions 42 through 49.
- 30. Switzerland (in Swiss francs)

No.   Stock             Close   Prev. close
1.    Alusuisse         976     982
2.    Brown Boveri      3,960   4,080
3.    Ciba-Geigy br     3,190   3,240
4.    Ciba-Geigy reg    3,080   3,110
5.    Ciba-G ptc ctf    3,020   3,040
6.    CS Holding        1,920   1,915
7.    Hof LaRoch br     8,280   8,300
8.    Roce div rt       5,360   5,330
9.    Nestle bearer     8,420   8,450
10.   Nestle reg        8,310   8,310
11.   Nestle ptc ctf    1,570   1,585
12.   Sandoz            2,390   2,410
13.   Sulzer            465     470
14.   Swiss Bank Cp     301     299
15.   Swiss Reinsur     2,520   2,530
16.   Swissair          667     680
17.   UBS               3,230   3,230
18.   Winterthur        3,390   3,420
19.   Zurich Ins        4,080   4,090

Source: Wall Street Journal, November 1, 1991

42. Construct a tally table for the closing stock prices in the "Close" column. Use 1,000-point intervals beginning with 301–1,300, 1,301–2,300, etc.
43. Compute the change in prices by subtracting the previous closing price from the current closing price.
44. Use your answer to question 43 to construct a tally table. Use 30-point intervals beginning with −120 to −91, −90 to −61, etc.
45. Use your answer to question 44 to compute the cumulative frequencies.
46. Use your answer to question 44 to compute the relative and cumulative relative frequencies.
47. Use your answer to question 46 to graph the relative frequency.
48. Use your answer to question 46 to graph the cumulative frequency.
49. Create a frequency polygon using data from question 44.
50. Draw the stem-and-leaf display of DPS of JNJ and Merck during the period 1988–2009 using Table 2.3, in which data on EPS, DPS, and PPS for JNJ, Merck, and the S&P 500 during the period 1988–2009 are given.
51. Refer to Table 2.5, in which the balance sheets of JNJ for the years 2008 and 2009 are given. Draw the pie charts of the composition of the total current assets of JNJ for 2008 and 2009, respectively.

Use Table 2.8, in which the 7 financial ratios of JNJ and Merck during the period 1990–2009 are given, to answer questions 52 and 53.

52. Construct a frequency, cumulative frequency, and relative frequency table for the "price–earnings ratio" (PER) of JNJ using the following class boundaries for PER: −20.000 to 0.000, 0.000 to 5.000, 5.000 to 10.000, 10.000 to 15.000, 15.000 to 20.000, 20.000 to 27.000.
53. Draw the histogram and frequency polygon of the above frequency distribution.
