
# Statistics for business and financial economics

Published in: Technology, Economy & Finance

### Transcript of "Statistics for business and financial economics"

Chapter 3 Frequency Distributions and Data Analyses

Chapter Outline

- 3.1 Introduction
- 3.2 Tally Table for Constructing a Frequency Table
- 3.3 Three Other Frequency Tables
- 3.4 Graphical Presentation of Frequency Distribution
- 3.5 Further Economic and Business Applications
- 3.6 Summary
- Questions and Problems

Key Terms

- Grouped data
- Raw (nongrouped) data
- Frequency
- Frequency table
- Frequency distribution
- Cumulative frequencies
- Relative frequency
- Cumulative relative frequency
- Histograms
- Stem-and-leaf displays
- Frequency polygon
- Cumulative frequency polygon
- Pie chart
- Lorenz curve
- Gini coefficient
- Absolute inequality

3.1 Introduction

Using the tabular and graphical methods discussed in Chap. 2, we will now develop two general ways to describe data more fully. We discuss first the tally table approach to depicting data frequency distributions and then three other kinds of frequency tables. Next, we explore alternative graphical methods for describing frequency distributions. Finally, we study further applications for frequency distributions in business and economics.

C.-F. Lee et al., Statistics for Business and Financial Economics, DOI 10.1007/978-1-4614-5897-5_3, © Springer Science+Business Media New York 2013
3.2 Tally Table for Constructing a Frequency Table

Before conducting any statistical analysis, we must organize our data sets. One way to organize data is by using a tally table as a worksheet for setting up a frequency table. To set up a tally table for a set of data, we split the data into equal-sized classes in such a way that each observation fits into one and only one class of numbers (i.e., the classes are mutually exclusive). Sometimes data are reported in a frequency table with class intervals given but with actual values of observations in the classes unknown; data presented in this manner are called grouped data. The analyst assigns each data point to a class and enters a tally mark beside that class. Let's see how this works.

Example 3.1 Tallying Scores from a Statistics Exam. Suppose a statistics professor wants to summarize how 20 students performed on an exam. Their scores are as follows: 78, 56, 91, 59, 78, 84, 65, 97, 84, 71, 84, 44, 69, 90, 73, 77, 80, 90, 68, and 75. Data in this form are called nongrouped data or raw data. We can use a tally table like Table 3.1 to list the number of occurrences, or frequency, of each score. A corresponding diagram is shown in Fig. 3.1.

This table presents nongrouped data, and no pattern emerges from them. As an alternative, the data can be grouped into classes by letter grade. If the professor uses a straight grading scale, the classes might be 90–99, 80–89, 70–79, 60–69, and below 60. After establishing the classes, the professor counts scores in each class and records these numbers to obtain a tally sheet, as shown in Table 3.2 and Fig. 3.2.

Note that each observation is included in one and only one class.
The tallies are counted, and a frequency table is constructed as shown in Table 3.3, where letter grades are assigned to each class.

Table 3.1 Student exam scores

| Score | Tallies | Frequency |
|---|---|---|
| 44 | / | 1 |
| 56 | / | 1 |
| 59 | / | 1 |
| 65 | / | 1 |
| 68 | / | 1 |
| 69 | / | 1 |
| 71 | / | 1 |
| 73 | / | 1 |
| 75 | / | 1 |
| 77 | / | 1 |
| 78 | // | 2 |
| 80 | / | 1 |
| 84 | /// | 3 |
| 90 | // | 2 |
| 91 | / | 1 |
| 97 | / | 1 |
| Total | | 20 |
Fig. 3.1 Bar graph for nongrouped student exam scores given in Table 3.1 (x-axis: score; y-axis: frequency)

Table 3.2 Tally table for statistics exam scores

| Class | Tally | Frequency |
|---|---|---|
| Below 60 | /// | 3 |
| 60–69 | /// | 3 |
| 70–79 | ////// | 6 |
| 80–89 | //// | 4 |
| 90–99 | //// | 4 |
| Total | | 20 |

Fig. 3.2 Bar graph for grouped student exam scores given in Table 3.2 (x-axis: score class; y-axis: frequency)
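The tallying in Example 3.1 can be sketched in a few lines of Python. This is only an illustration: the helper name `class_of` is hypothetical, and the classes are the straight grading scale described in the example.

```python
from collections import Counter

# The 20 raw exam scores from Example 3.1.
scores = [78, 56, 91, 59, 78, 84, 65, 97, 84, 71,
          84, 44, 69, 90, 73, 77, 80, 90, 68, 75]

# Class labels in display order (straight grading scale).
classes = ["Below 60", "60-69", "70-79", "80-89", "90-99"]


def class_of(score):
    """Return the label of the one class that contains the score."""
    if score < 60:
        return "Below 60"
    low = (score // 10) * 10
    return f"{low}-{low + 9}"


# Each score is assigned to exactly one class, then the tallies are counted.
tally = Counter(class_of(s) for s in scores)
frequency_table = [(label, tally[label]) for label in classes]

for label, count in frequency_table:
    print(f"{label:<9} {'/' * count:<7} {count}")
print("Total", sum(tally.values()))
```

Because `class_of` returns exactly one label per score, the classes are mutually exclusive by construction, which is the property the text asks for.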
Example 3.2 A Frequency Distribution of Grade Point Averages. Suppose that there are 30 students in a classroom and that they have the grade point averages listed in Table 3.4. A tally table is constructed, in which classes are (arbitrarily) defined at every half-point and each tally marked next to a particular class accounts for one data entry. The entries are then counted to obtain a frequency distribution, as shown in Table 3.5. A frequency distribution simply shows how many observations fall into each class. We will discuss this concept in further detail in the next section.

Generally, a data set should be divided into 5–15 classes. Having too few or too many classes gives too little information. Imagine a frequency distribution with only two classes: 0.0–2.0 and 2.1–4.0. With such broadly defined classes, it is difficult to distinguish among GPAs. Similarly, if the class interval were only one-tenth of a point, the large number of classes, each with only one or a few tallies, would make summarizing the data almost impossible.

Table 3.3 Frequency table for statistics exam scores

| Class | Grade | Frequency |
|---|---|---|
| Below 60 | F | 3 |
| 60–69 | D | 3 |
| 70–79 | C | 6 |
| 80–89 | B | 4 |
| 90–99 | A | 4 |
| Total | | 20 |

Table 3.4 Student GPAs: raw data

| | | |
|---|---|---|
| 1.2 | 3.9 | 1.9 |
| 3.8 | 2.4 | 2.7 |
| 2.3 | 2.3 | 2.6 |
| 0.7 | 3.1 | 3.7 |
| 3.6 | 2.9 | 4.0 |
| 2.2 | 2.7 | 1.2 |
| 1.9 | 0.8 | 1.8 |
| 2.1 | 0.3 | 2.4 |
| 3.1 | 3.2 | 3.2 |
| 0.8 | 3.1 | 3.6 |

Table 3.5 Student GPAs: tally table and frequency distribution

| Range | Tallies | Frequency |
|---|---|---|
| Below 1.5 | ////// | 6 |
| 1.5–1.9 | /// | 3 |
| 2.0–2.4 | ////// | 6 |
| 2.5–2.9 | //// | 4 |
| 3.0–3.4 | ///// | 5 |
| 3.5–4.0 | ////// | 6 |
| Total | | 30 |
In the GPA example, it was relatively easy to construct the classes because GPA cutoffs were used. However, in most examples, there are no natural dividing lines between classes. The following guidelines can be used to construct classes:

1. Construct from 5 to 15 classes. This step is the most difficult, because using too many classes defeats the purpose of grouping the data into classes, whereas having too few classes limits the amount of information obtained from the data. As a general rule, when the range and number of observations are large, more classes can be defined. Fewer classes should be constructed when the number of observations is only around 20 or 30.
2. Make sure each observation falls into only one class. This can often be accomplished by defining class boundaries in terms of several decimal places. If the percentage return on stocks is carried to one decimal place, for example, then defining the classes by using two decimal places will ensure that each observation falls into only one class.
3. Try to construct classes with equal class intervals. This may not be possible, however, if there are outlying observations in the data set.

Example 3.3 A Frequency Distribution of 3-Month Treasury Bill Rates. Table 3.6 presents another example, and here the data presented are the interest rates on 3-month treasury bills (T-bills) from 1990 to 2009. (T-bills are debt instruments sold by the US government to finance its budgetary needs.) The annual data for interest rates (average daily rates for a year) are taken from Economic Report of the President, January 2009.

As we have noted, a frequency distribution gives the total number of occurrences in each class. In the next chapter, we will talk about using a frequency distribution to present data.

By setting up a tally table and a frequency table, we can scrutinize data for errors.
For example, if the data value 123 appears in a column for the rate in the T-bill example, a mistake has clearly been made, one that could be due to a missing decimal point. The data point was probably meant to be 12.3 %, which makes more sense because it is closer to the range of the other data points. Data should also be checked for accuracy. Otherwise, invalid conclusions could be reached.

Table 3.6 T-bill interest rates, 1990–2009

| Class (%) | Tallies | Frequency |
|---|---|---|
| 0–1.49 | //// | 4 |
| 1.50–3.49 | ///// | 5 |
| 3.50–5.49 | ///////// | 9 |
| 5.50–6.49 | / | 1 |
| 6.50 and greater | / | 1 |
| Total | | 20 |
Example 3.5 Frequency Distributions of Current Ratios for JNJ and MRK. The current ratios for JNJ and MRK from 1990 to 2009 are shown in Table 3.9. A frequency distribution for the current ratios of Johnson and Johnson and Merck is shown in Table 3.10. This ratio is a measure of liquidity, which (as we noted in Chap. 2) indicates how quickly a firm can obtain cash for operations. The first column defines the classes. Note that the use here of class boundaries ensures that each observation will fall into only one class.

The next column shows the frequency, that is, the number of times that an observation appears in each class. In Table 3.10, we see that JNJ experienced one current ratio between 1.0 and 1.2, seven between 1.201 and 1.700, and so on. The third column presents the cumulative frequency. Because there are 20 observations in the population, the cumulative frequency for the last class is 20.

The fourth column presents the relative frequency, which measures the percentage of observations in each class. Relative frequencies can be thought of as probabilities. For example, the probability that an observation is in the first class is 0.05.

Table 3.9 Current ratio for JNJ and MRK

| Year | JNJ | MRK |
|---|---|---|
| 1990 | 1.778 | 1.332 |
| 1991 | 1.835 | 1.532 |
| 1992 | 1.582 | 1.216 |
| 1993 | 1.624 | 0.973 |
| 1994 | 1.566 | 1.270 |
| 1995 | 1.809 | 1.515 |
| 1996 | 1.807 | 1.600 |
| 1997 | 1.999 | 1.475 |
| 1998 | 1.364 | 1.685 |
| 1999 | 1.771 | 1.285 |
| 2000 | 2.164 | 1.375 |
| 2001 | 2.296 | 1.123 |
| 2002 | 1.683 | 1.199 |
| 2003 | 1.710 | 1.205 |
| 2004 | 1.962 | 1.147 |
| 2005 | 2.485 | 1.582 |
| 2006 | 1.199 | 1.197 |
| 2007 | 1.510 | 1.227 |
| 2008 | 1.649 | 1.348 |
| 2009 | 1.820 | 1.805 |

Table 3.8 Relative frequency table for grade distribution

| Class | Grade | Relative frequency | Cumulative relative frequency |
|---|---|---|---|
| Below 60 | F | 0.15 | 0.15 |
| 60–69 | D | 0.15 | 0.30 |
| 70–79 | C | 0.30 | 0.60 |
| 80–89 | B | 0.20 | 0.80 |
| 90–99 | A | 0.20 | 1.00 |
The last column indicates the cumulative relative frequency, which measures the percentage of observations in a particular class and all previous classes. The cumulative relative frequency for Merck's fourth class is calculated by adding the relative frequencies of the first four classes to arrive at 0.90. That is, 90 % of the observations occur in the first four classes. The cumulative relative frequency of the last class always equals 1, because the last class includes all the observations.

3.4 Graphical Presentation of Frequency Distribution

We have spoken before of the special effectiveness of using graphs to present data. In this section, we discuss four different graphical approaches to presenting frequency distributions.

3.4.1 Histograms

Frequency distributions can be represented on a variety of graphs. The histogram, which is one of the most commonly used types, is similar to the bar charts discussed in Chap. 2 except that

1. Neighboring bars touch each other.
2. The area inside any bar (its height times its width) is proportional to the number of observations in the corresponding class.

To illustrate these two points, suppose the age distribution of personnel at a small business is as shown in Table 3.11.

Table 3.10 Frequency distributions of current ratios for JNJ and MRK

JNJ

| Class | Frequency | Cumulative frequency | Relative frequency | Cumulative relative frequency |
|---|---|---|---|---|
| 1.00–1.2 | 1 | 1 | 0.05 | 0.05 |
| 1.21–1.4 | 1 | 2 | 0.05 | 0.10 |
| 1.41–1.60 | 3 | 5 | 0.15 | 0.25 |
| 1.61–1.80 | 6 | 11 | 0.30 | 0.55 |
| 1.81–2.00 | 6 | 17 | 0.30 | 0.85 |
| 2.01–2.5 | 3 | 20 | 0.15 | 1.00 |
| Total | 20 | | 1.00 | |

MRK

| Class | Frequency | Cumulative frequency | Relative frequency | Cumulative relative frequency |
|---|---|---|---|---|
| 0.81–1.00 | 1 | 1 | 0.05 | 0.05 |
| 1.01–1.2 | 4 | 5 | 0.20 | 0.25 |
| 1.21–1.4 | 8 | 13 | 0.40 | 0.65 |
| 1.41–1.60 | 5 | 18 | 0.25 | 0.90 |
| 1.61–1.80 | 1 | 19 | 0.05 | 0.95 |
| 1.81–2.00 | 1 | 20 | 0.05 | 1.00 |
| Total | 20 | | 1.00 | |
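As a rough check on how the four kinds of frequency column relate to one another, the MRK current ratios of Table 3.9 can be binned and accumulated as below. This is a sketch under one stated assumption: a ratio is placed in the first class whose upper limit it does not exceed, so a value such as 1.205 falls in the 1.21–1.4 class.

```python
from bisect import bisect_left
from itertools import accumulate

# MRK current ratios, 1990-2009 (Table 3.9).
mrk = [1.332, 1.532, 1.216, 0.973, 1.270, 1.515, 1.600, 1.475, 1.685, 1.285,
       1.375, 1.123, 1.199, 1.205, 1.147, 1.582, 1.197, 1.227, 1.348, 1.805]

# Upper class limits for 0.81-1.00, 1.01-1.2, ..., 1.81-2.00.
# Assumption: a value belongs to the first class whose upper limit it does not exceed.
upper_limits = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0]

n = len(mrk)
frequency = [0] * len(upper_limits)
for ratio in mrk:
    frequency[bisect_left(upper_limits, ratio)] += 1

cumulative = list(accumulate(frequency))          # running totals
relative = [f / n for f in frequency]             # share of all observations
cumulative_relative = [c / n for c in cumulative]

for row in zip(upper_limits, frequency, cumulative, relative, cumulative_relative):
    print("up to {:.2f}: f={}, cum={}, rel={:.2f}, cum rel={:.2f}".format(*row))
```

The last cumulative frequency equals the number of observations and the last cumulative relative frequency equals 1, as the text states.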
To construct a histogram, we need to enter a scale on the horizontal axis. Because the data are discrete, there is a gap between the class intervals, say between 20–29 and 30–39. In such a case, we will use the midpoint between the end of one class and the beginning of the next as our dividing point. Between the 20–29 interval and the 30–39 interval, the dividing point will be (29 + 30)/2 = 29.5. We find the dividing point between the remaining classes similarly.

To satisfy the second condition, we note that all six classes have an interval width of 10 years. Figure 3.3 is the histogram that reflects these data.

Drawn from the data of Table 3.10, Fig. 3.4a, b are histograms of JNJ's and MRK's current ratios for the years 1990–2009. The x-axis indicates the classes and the y-axis the frequencies. As the histograms show, MRK's current ratios have tended to fall in the 1.0–1.4 range, whereas those of JNJ show no exact pattern, but many can be found in the 1.61–2.00 range. (In Chap. 4, we will cover measures of skewness, which give us more insight into the shape of a distribution.)

Table 3.11 Age distribution of personnel

| Class | Frequency |
|---|---|
| 20–29 | 3 |
| 30–39 | 6 |
| 40–49 | 7 |
| 50–59 | 4 |
| 60–69 | 1 |
| 70–79 | 1 |

Fig. 3.3 Histogram of age distribution given in Table 3.11 (x-axis: age, with dividing points 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5; y-axis: frequency)
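The dividing-point rule can be sketched as follows for the age classes of Table 3.11. The outermost boundaries (19.5 and 79.5) are an assumption taken from the axis of Fig. 3.3: they extend the first and last class by half the gap between neighboring classes.

```python
# Discrete age classes and their frequencies (Table 3.11).
classes = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79)]
frequencies = [3, 6, 7, 4, 1, 1]

# Dividing point between neighbors: midpoint of one class's end and the next one's start.
inner = [(high + next_low) / 2
         for (_, high), (next_low, _) in zip(classes, classes[1:])]

# Assumption: the outer boundaries extend the end classes by half the gap (here 0.5).
gap = classes[1][0] - classes[0][1]
boundaries = [classes[0][0] - gap / 2] + inner + [classes[-1][1] + gap / 2]

# Bar widths and areas; with equal widths, area is proportional to frequency.
widths = [right - left for left, right in zip(boundaries, boundaries[1:])]
areas = [w * f for w, f in zip(widths, frequencies)]
area_shares = [a / sum(areas) for a in areas]

print(boundaries)
print(widths)
print([round(s, 3) for s in area_shares])
```

Each bar's share of the total area equals the class's relative frequency, which is the geometric reading of condition 2 above.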
Most standard statistical software packages will construct a histogram from these data. Using MINITAB, we can specify the class width and the starting class midpoint, or we can let MINITAB select these values. The output will contain the frequency distribution as well as a graphical representation in the form of a histogram (without the bars). MINITAB will provide each class frequency next to the corresponding class midpoint (not class limits). Figure 3.5a contains the necessary MINITAB commands and the resulting output for the current ratio of MRK where the class width (CW) and the midpoint of the first class were not specified. Figure 3.5b specified CW as .2000 and the first midpoint as .905. We can use the output as it appears or use this information to construct Fig. 3.4b, which is a graphical representation of MRK's current ratios as given in Table 3.10.

Fig. 3.4 (a) Frequency histogram of JNJ's current ratios as given in Table 3.10 (b) Frequency histogram of MRK's current ratios as given in Table 3.10
Fig. 3.5 (a) Histogram using MINITAB, where the class width and the midpoint of the first class are not specified (b) Histogram using MINITAB using specified classes, where the class width is 0.2000 and the first midpoint is 0.905

    Data Display

    MRK
    1.332 1.532 1.216 0.973 1.27  1.515 1.6   1.475 1.685
    1.285 1.375 1.123 1.199 1.205 1.147 1.582 1.197 1.227
    1.348 1.805

    (a) Histogram of MRK
    * NOTE * The character graph commands are obsolete.

    Histogram of MRK   N = 20
    Midpoint  Count
    1.0       1  *
    1.1       2  **
    1.2       5  *****
    1.3       4  ****
    1.4       1  *
    1.5       3  ***
    1.6       2  **
    1.7       1  *

    (b) Histogram of MRK   N = 20
    Midpoint  Count
    0.905     1  *
    1.105     4  ****
    1.305     8  ********
    1.505     5  *****
    1.705     1  *
    1.905     1  *

Histograms can also be used to chart the companies' relative and cumulative frequencies, as shown in Figs. 3.6 and 3.7. Note the similarity between the frequency and relative frequency histograms (Figs. 3.4 and 3.6) and between the cumulative frequency and the relative cumulative frequency graphs (Figs. 3.7 and 3.8); the only difference between them is the variable on the y-axis. Note also that geometrically, the relative frequency of each class in a frequency histogram equals its area divided
by the total area of all the classes. For example, the area for the first class for Merck's current ratio (Fig. 3.4b) is equal to the base of the bar times its height (0.19 × 1 = 0.19), and the sum of all the areas is 3.8. The relative frequency for the first class is thus 0.19/3.8 = 0.05.

Fig. 3.6 (a) Relative frequency histogram of JNJ's current ratios (b) Relative frequency histogram of MRK's current ratios

3.4.2 Stem-and-Leaf Display

An alternative to histograms for the presentation of either nongrouped or grouped data is the stem-and-leaf display. Stem-and-leaf displays were originally developed by John Tukey of Princeton University. They are extremely useful in summarizing data sets of reasonable size (under 100 values as a general rule), and unlike histograms, they result in no loss of information. By this, we mean that it is possible
to reconstruct the original data set in a stem-and-leaf display, which we cannot do when using a histogram.

Fig. 3.7 (a) Cumulative frequency histogram of JNJ's current ratios (b) Cumulative frequency histogram of MRK's current ratios

For example, suppose a financial analyst is interested in the amount of money spent by food product companies on advertising. He or she samples 40 of these food product firms and calculates the amount that each spent last year on advertising as a percentage of its total revenue. The results are listed in Table 3.12.

Let's use this set of data to construct a stem-and-leaf display. In Fig. 3.9, each observation is represented by a stem to the left of the vertical line and a leaf to the right of the vertical line. For example, the stems and leaves for the first three observations in Table 3.12 can be defined as
| Stem | Leaf |
|---|---|
| 12 | 0.5 |
| 8 | 0.8 |
| 11 | 0.5 |

In other words, stems are the integer portions of the observations, whereas leaves represent the decimal portions.

Fig. 3.8 (a) Cumulative relative frequency histogram of JNJ's current ratios (b) Cumulative relative frequency histogram of MRK's current ratios

The procedure used to construct a stem-and-leaf display is as follows:

1. Decide how the stems and leaves will be defined.
2. List the stems in a column in ascending order.
3. Proceed through the data set, placing the leaf for each observation in the appropriate stem row. (You may want to place the leaves of each stem in increasing order.)

Table 3.12 Percentage of total revenue spent on advertising

| Company | Percentage | Company | Percentage |
|---|---|---|---|
| 1 | 12.5 | 21 | 6.4 |
| 2 | 8.8 | 22 | 7.8 |
| 3 | 11.5 | 23 | 8.5 |
| 4 | 9.1 | 24 | 9.5 |
| 5 | 9.4 | 25 | 11.3 |
| 6 | 10.1 | 26 | 8.9 |
| 7 | 5.3 | 27 | 6.6 |
| 8 | 10.3 | 28 | 7.5 |
| 9 | 10.2 | 29 | 8.3 |
| 10 | 7.4 | 30 | 13.8 |
| 11 | 8.2 | 31 | 12.9 |
| 12 | 7.8 | 32 | 11.8 |
| 13 | 6.5 | 33 | 10.4 |
| 14 | 9.8 | 34 | 7.6 |
| 15 | 9.2 | 35 | 8.6 |
| 16 | 12.8 | 36 | 9.4 |
| 17 | 13.9 | 37 | 7.3 |
| 18 | 13.7 | 38 | 9.5 |
| 19 | 9.6 | 39 | 8.3 |
| 20 | 6.8 | 40 | 7.1 |

Fig. 3.9 Stem-and-leaf display for advertising expenditure

| Stems | Leaves | Frequency |
|---|---|---|
| 5 | 3 | 1 |
| 6 | 4 5 6 8 | 4 |
| 7 | 1 3 4 5 6 8 8 | 7 |
| 8 | 2 3 3 5 6 8 9 | 7 |
| 9 | 1 2 4 4 5 5 6 8 | 8 |
| 10 | 1 2 3 4 | 4 |
| 11 | 3 5 8 | 3 |
| 12 | 5 8 9 | 3 |
| 13 | 7 8 9 | 3 |
| Total | | 40 |

The percentage of revenues spent on advertising by 40 production firms listed in Table 3.12 is represented by a stem-and-leaf diagram in Fig. 3.9. From this diagram, we observe that the minimum percentage of advertising spending is 5.3 % of total revenue, the maximum percentage of advertising spending is 13.9 %, and the largest group of firms spends between 9.1 % and 9.8 % of total revenue on advertising. Also, the 7 leaves in stem row 7 indicate that 7 firms' advertising spending is at least 7 % but less than 8 %. The 3 leaves in stem row 13 tell us at a
glance that 3 firms spend more than 13 % of total revenue on advertising. A MINITAB version of the stem-and-leaf diagram generated by these data is shown in Fig. 3.10. A stem-and-leaf diagram is presented in the last portion of Fig. 3.10. In the first column of this diagram, (8) represents the total number of observations in the middle group with a stem of 9; 1, 5, 12, and 19 represent the cumulative frequencies from the first group up to the fourth group; and 3, 6, 9, and 13 represent the cumulative frequencies from the ninth group up to the sixth group.

3.4.3 Frequency Polygon

A frequency polygon is obtained by linking the midpoints indicated on the x-axis of the class intervals from a frequency histogram. A cumulative frequency polygon is derived by connecting the midpoints indicated on the x-axis of the class intervals from a cumulative frequency histogram. Figures 3.11 and 3.12 show the frequency polygon and the cumulative frequency polygon, respectively, for JNJ's current ratio. Although a histogram does demonstrate the shape of the data, perhaps the shape can be more clearly illustrated by using a frequency polygon.

Fig. 3.10 Stem-and-leaf diagram for advertising expenditure using MINITAB

    Data Display

    ADV EXP
    12.5  8.8 11.5  9.1  9.4 10.1  5.3 10.3
    10.2  7.4  8.2  7.8  6.5  9.8  9.2 12.8
    13.9 13.7  9.9  6.8  6.4  7.8  8.5  9.5
    11.3  8.9  6.6  7.5  8.3 13.8 12.9 11.8
    10.4  7.6  8.6  9.4  7.3  9.5  8.3  7.1

    MTB > STEM AND LEAF USING 'ADV EXP'

    Character Stem-and-Leaf Display
    Stem-and-leaf of ADV EXP  N = 40
    Leaf Unit = 0.10

      1    5  3
      5    6  4568
     12    7  1345688
     19    8  2335689
     (8)   9  12445589
     13   10  1234
      9   11  358
      6   12  589
      3   13  789
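The three-step stem-and-leaf procedure of Sect. 3.4.2 can be sketched in Python using the values of Table 3.12. This is an illustrative sketch, not MINITAB's algorithm: stems are taken as the integer part and leaves as the tenths digit, and the variable name `display` is arbitrary.

```python
from collections import defaultdict

# Advertising spending as a percentage of revenue (Table 3.12, companies 1-40).
percentages = [12.5, 8.8, 11.5, 9.1, 9.4, 10.1, 5.3, 10.3, 10.2, 7.4,
               8.2, 7.8, 6.5, 9.8, 9.2, 12.8, 13.9, 13.7, 9.6, 6.8,
               6.4, 7.8, 8.5, 9.5, 11.3, 8.9, 6.6, 7.5, 8.3, 13.8,
               12.9, 11.8, 10.4, 7.6, 8.6, 9.4, 7.3, 9.5, 8.3, 7.1]

# Step 1: stem = integer part, leaf = tenths digit.
# Working in whole tenths avoids floating-point surprises such as 8.8 * 10.
display = defaultdict(list)
for value in percentages:
    stem, leaf = divmod(round(value * 10), 10)
    display[stem].append(leaf)

# Steps 2 and 3: stems in ascending order, leaves sorted within each stem row.
for stem in sorted(display):
    display[stem].sort()
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves:<16} {len(display[stem])}")
print("Total", sum(len(leaves) for leaves in display.values()))
```

Because every leaf is kept, the original values can be rebuilt from the display, which is the "no loss of information" property noted in the text.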
3.4.4 Pie Chart

Histograms are perhaps the graphical forms most commonly used in statistics, but other pictorial forms, such as the pie chart, are often used to present financial and marketing data. For example, Fig. 3.13 depicts a family's sources of income. This pie chart indicates that 80 % of this family's income comes from salary.

For data already in frequency form, a pie chart is constructed by converting the relative frequencies of each class into their respective arcs of a circle. For example, a pie chart can be used to represent the student grade distribution data originally presented in Table 3.3. In Table 3.13, the arcs (in degrees) for the five slices shown in Fig. 3.14 were obtained by multiplying each relative frequency by 360.

Fig. 3.11 Frequency polygon of JNJ's current ratios

Fig. 3.12 Cumulative frequency polygon of JNJ's current ratios
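The conversion from relative frequencies to pie-chart arcs is a one-line calculation. The sketch below applies it to the grade frequencies of Table 3.3; the dictionary names are illustrative only.

```python
# Grade-class frequencies from Table 3.3.
grades = {"Below 60": 3, "60-69": 3, "70-79": 6, "80-89": 4, "90-99": 4}

total = sum(grades.values())

# Arc (degrees) = relative frequency x 360.
arcs = {label: 360 * count / total for label, count in grades.items()}

for label, arc in arcs.items():
    print(f"{label:<9} relative frequency {grades[label] / total:.2f}  arc {arc:.0f} degrees")
print("Total arc:", sum(arcs.values()))
```

The arcs sum to 360 degrees because the relative frequencies sum to 1.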
Fig. 3.13 Sources of family income

Table 3.13 Grade distribution for 20 students

| Class | Frequency | Relative frequency | Arc (degrees) |
|---|---|---|---|
| Below 60 | 3 | 0.15 | 54 |
| 60–69 | 3 | 0.15 | 54 |
| 70–79 | 6 | 0.30 | 108 |
| 80–89 | 4 | 0.20 | 72 |
| 90–99 | 4 | 0.20 | 72 |
| Total | 20 | 1.00 | 360 |

Fig. 3.14 Grade distribution pie chart

3.5 Further Economic and Business Applications

3.5.1 Lorenz Curve

The Lorenz curve, which represents a society's distribution of income, is a cumulative frequency curve used in economics (Fig. 3.15a). The cumulative percentage of families (ranked by income) is measured on the x-axis, and the cumulative
Fig. 3.15 (a) and (b) Lorenz curves (x-axis: cumulative percentage of families; y-axis: cumulative percentage of family income; panel a shows curve H, points O, N, P, A, B, C, and areas I and II; panel b shows curves H and S)
percentage of family income received is measured on the y-axis. For example, suppose there are 100 families, and each earns \$100, that is, the distribution of income is perfectly equal. The resulting Lorenz curve will be a 45° line (OP), because the cumulative percentage of families (e.g., 40 %) and the cumulative share of family income received are always equal.

Now suppose that one family receives 100 % of total family income, that is, the income distribution is absolutely unequal. The resulting Lorenz curve (ONP) coincides with the x-axis until point N, where there is a discontinuous jump to point P. This is because, with the exception of that single family (represented by point N), each family receives 0 % of total family income. Therefore, these families' cumulative share of total family income is also 0 %.

The shape the Lorenz curve is most likely to assume is curve H, which lies between absolute inequality and equality. This curve indicates that the lowest-income families, who comprise 40 % of families (point A), receive a disproportionately small share (about 7 %) of total family income (point C). If every family had the same income, the share going to the lowest 40 % would be represented by point B (40 %).

Note that with a more equitable distribution of income, the Lorenz curve is less bowed, or flatter. Curve S in Fig. 3.15b is the Lorenz curve after a progressive income tax is imposed. Because S is flatter than H (which is reproduced from Fig. 3.15a), we can conclude that the distribution of income (after taxes) is more nearly equal than before, as would be expected.

One way to measure the inequality of income from the Lorenz curve is to use the Gini coefficient.

$$\text{Gini coefficient for curve } H = \frac{\text{area I}}{\text{area (I + II)}}$$

The Gini coefficient can range from 0 (perfect equality) to 1 (absolute inequality, wherein one family receives all the income).

Examining Fig. 3.15b reveals that the Gini coefficient will be smaller for curve S than it is for curve H.
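To make the area ratio concrete, the sketch below builds Lorenz points from a list of family incomes and approximates the Gini coefficient with the trapezoid rule. It assumes that area I lies between the 45° line and the Lorenz curve and that area II lies under the curve, so area (I + II) is the triangle of area 0.5. The function names and the `mixed` incomes are hypothetical illustrations, not data from the text.

```python
def lorenz_points(incomes):
    """Cumulative (share of families, share of income) points, starting at (0, 0)."""
    ordered = sorted(incomes)          # families ranked by income
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, income in enumerate(ordered, start=1):
        running += income
        points.append((i / n, running / total))
    return points


def gini(incomes):
    """Area I / area (I + II), with area II (under the curve) from the trapezoid rule."""
    pts = lorenz_points(incomes)
    area_ii = sum((x1 - x0) * (y0 + y1) / 2
                  for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return (0.5 - area_ii) / 0.5       # area (I + II) is the triangle under the 45-degree line


equal = [100] * 100                    # every family earns $100: curve OP
one_family = [0] * 99 + [100]          # one family receives everything: curve ONP
mixed = [10, 20, 30, 40, 100]          # hypothetical incomes, for illustration only

print(round(gini(equal), 4))
print(round(gini(one_family), 4))
print(round(gini(mixed), 4))
```

With a finite number of families the absolutely unequal case gives a value just below 1 (here 0.99), approaching 1 as the number of families grows.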
In other words, the progressive income tax makes the distribution of income more nearly equal.

3.5.2 Stock and Market Rate of Return

Table 3.14 presents the frequency tables for the rate of return for Johnson and Johnson, Merck, and the stock market overall. (The data are drawn from Table 2.4 in Appendix 2 of Chap. 2.) Because the two firms have similar frequency distributions, we can conclude that the performances of Johnson and Johnson and Merck's stocks have been similar. However, Johnson and Johnson's highest class is
0.001–0.200, while Merck's highest classes are spread but found at −0.200 and below and at 0.201–0.400.

The stock market's overall lowest class was found at −0.200 and below, but its highest class was only 0.001–0.200. Thus, the overall market has fluctuated less than the return of the two pharmaceutical firms. And although Johnson and Johnson and Merck have a higher top class, the market suffered through fewer negative returns. Moreover, Johnson and Johnson and Merck had 9 and 8 years, respectively, of losses, while the market had only five. In other words, the pharmaceutical firms offered the potential of higher returns but also threatened the investor with a greater risk of loss.

Table 3.14 Rates of return for JNJ and MRK stock and the S&P 500

JNJ

| Class | Frequency (years) | Cumulative frequency | Relative frequency | Cumulative relative frequency |
|---|---|---|---|---|
| −0.200 and below | 4 | 4 | 0.1905 | 0.1905 |
| −0.199 to 0.000 | 5 | 9 | 0.2381 | 0.4286 |
| 0.001–0.200 | 5 | 14 | 0.2381 | 0.6667 |
| 0.201–0.400 | 5 | 19 | 0.2381 | 0.9048 |
| 0.401–0.600 | 1 | 20 | 0.0476 | 0.9524 |
| 0.601–1.00 | 1 | 21 | 0.0476 | 1.0000 |
| Total | 21 | | 1.000 | |

MRK

| Class | Frequency (years) | Cumulative frequency | Relative frequency | Cumulative relative frequency |
|---|---|---|---|---|
| −0.200 and below | 5 | 5 | 0.2381 | 0.2381 |
| −0.199 to 0.000 | 3 | 8 | 0.1429 | 0.3810 |
| 0.001–0.200 | 3 | 11 | 0.1429 | 0.5238 |
| 0.201–0.400 | 5 | 16 | 0.2381 | 0.7619 |
| 0.401–0.600 | 3 | 19 | 0.1429 | 0.9048 |
| 0.601–1.00 | 2 | 21 | 0.0952 | 1.0000 |
| Total | 21 | | 1.000 | |

S&P 500 (market)

| Class | Frequency (years) | Cumulative frequency | Relative frequency | Cumulative relative frequency |
|---|---|---|---|---|
| −0.200 and below | 1 | 1 | 0.0476 | 0.0476 |
| −0.199 to 0.000 | 4 | 5 | 0.1905 | 0.2381 |
| 0.001–0.200 | 11 | 16 | 0.5238 | 0.7619 |
| 0.201–0.400 | 5 | 21 | 0.2381 | 1.0000 |
| Total | 21 | | 1.000 | |

3.5.3 Interest Rates

Histograms can be used to summarize movements in such interest rates as the prime rate and the treasury bill rate. The prime rate is the interest rate that banks charge to their best customers; treasury bills are short-term debt instruments issued by the US
government. Let us examine how these rates have moved over the period 1990–2009, as shown in Table 3.15.

As can be seen in Table 3.16 and Fig. 3.16, the prime rate is skewed to the right, with 65 % of the observations appearing in the ranges made up of the slightly higher midrange interest rates (6–6.9 %, 7–7.9 %, and 8–8.9 %). If you were to predict a future value for the prime rate, your best guess would be in the 6–9 % range. This wide range would probably not be of much use. Better methods for prediction, such as multiple regression and time series analysis, will be discussed later (Chaps. 15 and 18).

Table 3.15 3-Month T-bill rate and prime rate (1990–2009)

| Year | 3-Month T-bill rate | Prime rate |
|---|---|---|
| 90 | 7.49 | 10.01 |
| 91 | 5.38 | 8.46 |
| 92 | 3.43 | 6.25 |
| 93 | 3.00 | 6.00 |
| 94 | 4.25 | 7.14 |
| 95 | 5.49 | 8.83 |
| 96 | 5.01 | 8.27 |
| 97 | 5.06 | 8.44 |
| 98 | 4.78 | 8.35 |
| 99 | 4.64 | 7.99 |
| 00 | 5.82 | 9.23 |
| 01 | 3.39 | 6.92 |
| 02 | 1.60 | 4.68 |
| 03 | 1.01 | 4.12 |
| 04 | 1.37 | 4.34 |
| 05 | 3.15 | 6.19 |
| 06 | 4.73 | 7.96 |
| 07 | 4.35 | 8.05 |
| 08 | 1.37 | 5.09 |
| 09 | 0.15 | 3.25 |

Table 3.16 Frequency distributions of interest rates

| Class (%) | Prime rate: frequency | Prime rate: relative frequency | T-bill: frequency | T-bill: relative frequency |
|---|---|---|---|---|
| 0–1.99 | 0 | 0.00 | 5 | 0.25 |
| 2–2.99 | 0 | 0.00 | 0 | 0.00 |
| 3–3.99 | 1 | 0.05 | 4 | 0.20 |
| 4–4.99 | 3 | 0.15 | 5 | 0.25 |
| 5–5.99 | 1 | 0.05 | 5 | 0.25 |
| 6–6.99 | 4 | 0.20 | 0 | 0.00 |
| 7–7.99 | 3 | 0.15 | 1 | 0.05 |
| 8–8.99 | 6 | 0.30 | 0 | 0.00 |
| 9–9.99 | 1 | 0.05 | 0 | 0.00 |
| 10–10.99 | 1 | 0.05 | 0 | 0.00 |
| Total | 20 | 1.00 | 20 | 1.00 |
The frequency table for the treasury bill rate is shown in Table 3.16. This distribution, like that of the prime rate, is skewed to the right. Fifty percent of the observations appear in the third and fourth classes, 4–4.9 % and 5–5.9 %. This distribution is depicted in the histogram shown in Fig. 3.17.

If you were to make a prediction of the treasury bill rate, it would probably be in the 3–6 % range. Again, better methods for predicting observations will be discussed later.

Fig. 3.16 Frequency histogram of prime lending rates given in Table 3.15 (x-axis: interest rates, in classes 1–2.9, 3.0–4.9, 5.0–6.9, 7.0–8.9, 9.0–11; y-axis: frequency)

Fig. 3.17 Frequency histogram of T-bill rates given in Table 3.15 (x-axis: interest rate, in classes 0–1.9, 2–2.9, 3–3.9, 4–4.9, 5–5.9, 6–6.9, 7–7.9; y-axis: frequency)
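The class counts for both rates can be recomputed directly from the yearly values in Table 3.15. The sketch below is one way to do so; it assumes each class runs from its lower bound up to, but not including, the next lower bound, and the function name `frequencies` is illustrative.

```python
from bisect import bisect_right

# Yearly rates, 1990-2009, in percent (Table 3.15).
tbill = [7.49, 5.38, 3.43, 3.00, 4.25, 5.49, 5.01, 5.06, 4.78, 4.64,
         5.82, 3.39, 1.60, 1.01, 1.37, 3.15, 4.73, 4.35, 1.37, 0.15]
prime = [10.01, 8.46, 6.25, 6.00, 7.14, 8.83, 8.27, 8.44, 8.35, 7.99,
         9.23, 6.92, 4.68, 4.12, 4.34, 6.19, 7.96, 8.05, 5.09, 3.25]

# Lower bounds of the classes 0-1.99, 2-2.99, 3-3.99, ..., 10-10.99.
lower_bounds = [0, 2, 3, 4, 5, 6, 7, 8, 9, 10]


def frequencies(rates):
    """Count how many rates fall in each class (lower bound inclusive)."""
    counts = [0] * len(lower_bounds)
    for rate in rates:
        counts[bisect_right(lower_bounds, rate) - 1] += 1
    return counts


tbill_freq = frequencies(tbill)
prime_freq = frequencies(prime)

# Shares quoted in the text: prime rate in 6-8.99 %, T-bill rate in 4-5.99 %.
prime_share = sum(prime_freq[5:8]) / len(prime)
tbill_share = sum(tbill_freq[3:5]) / len(tbill)

print("T-bill:", tbill_freq)
print("Prime: ", prime_freq)
print(prime_share, tbill_share)
```

The two shares come out at 65 % for the prime rate and 50 % for the T-bill rate, matching the figures quoted in the discussion.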
3.5.4 Quality Control

Figure 3.18 depicts the quality control data on electronic parts given in Table 3.17. This control chart shows the percentage of defects for each sample lot. Figure 3.18 indicates that both lots 5 and 7 have exceeded the allowed maximum defect level of 3 %. Therefore, the product quality in these two lots should be improved.

Fig. 3.18 Frequency bar graph of the percentage of defects for each sample lot (x-axis: lot number; y-axis: percentage defective)

Table 3.17 Quality control report on electronic parts

| Sample lot | Sample | Defects | Percentage |
|---|---|---|---|
| 1 | 1,000 | 15 | 1.5 |
| 2 | 1,000 | 20 | 2.0 |
| 3 | 1,000 | 17 | 1.7 |
| 4 | 1,000 | 25 | 2.5 |
| 5 | 1,000 | 35 | 3.5 |
| 6 | 1,000 | 20 | 2.0 |
| 7 | 1,000 | 36 | 3.6 |
| 8 | 1,000 | 28 | 2.8 |
| Total | 8,000 | 196 | 2.45 (mean) |
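The check described above amounts to computing each lot's defect percentage and comparing it with the 3 % limit. A minimal sketch using the figures of Table 3.17 (variable names are illustrative):

```python
# Lot number -> (sample size, number of defects), from Table 3.17.
lots = {1: (1000, 15), 2: (1000, 20), 3: (1000, 17), 4: (1000, 25),
        5: (1000, 35), 6: (1000, 20), 7: (1000, 36), 8: (1000, 28)}

max_defect_pct = 3.0  # allowed maximum defect level stated in the text

# Percentage defective per lot.
defect_pct = {lot: 100 * defects / size for lot, (size, defects) in lots.items()}

# Lots whose defect percentage exceeds the allowed maximum.
flagged = [lot for lot, pct in defect_pct.items() if pct > max_defect_pct]

total_size = sum(size for size, _ in lots.values())
total_defects = sum(defects for _, defects in lots.values())
overall_pct = 100 * total_defects / total_size

print(defect_pct)
print("Lots above the limit:", flagged)
print("Overall percentage defective:", overall_pct)
```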
3.6 Summary

In this chapter, we extended the discussion of Chap. 2 by showing how data can be grouped to make analysis easier. After the data are grouped, frequency tables, histograms, stem-and-leaf displays, and other graphical techniques are used to present them in an effective and memorable way.

Our ultimate goal is to use a sample to make inferences about a population. Unfortunately, neither the tabular nor the graphical approach lends itself to measuring the reliability of an inference in data analysis. To do this, we must develop numerical measures for describing data sets. Therefore, in the next chapter, we show how data can be described by the use of descriptive statistics such as the mean, standard deviation, and other summary statistical measures.

Questions and Problems

1. Explain the difference between grouped and nongrouped data.
2. Explain the difference between frequency and relative frequency.
3. Explain the difference between frequency and cumulative frequency.
4. Carefully explain how the concept of cumulative frequency can be used to form the Lorenz curve.
5. Suppose you are interested in constructing a frequency distribution for the heights of 80 students in a class. Describe how you would do this.
6. What is a frequency polygon? Why is a frequency polygon useful in data presentation?
7. Use the prime rate data given in Table 3.6 in the text to construct cumulative frequency and cumulative relative frequency tables.
8. Use the percentage of total revenue spent on advertising listed in Table 3.12 of the text to draw a frequency polygon and a cumulative frequency polygon.
9. On November 17, 1991, the Home News of central New Jersey used the bar chart given here to show that foreign investors are taxed at a lower rate than US citizens.
   (a) Construct a table to show frequency, relative frequency, and cumulative frequency.
   (b) Draw a frequency polygon and a cumulative frequency polygon.
Bar chart for question 9: "Foreign investors get big tax breaks on money made in the U.S." (1988 tax rates, in percent)

- Rate for middle-income* U.S. taxpayers: 10.7 %.
- An American taxpayer earning \$30,000 to \$40,000 paid \$3,708. If taxed at the same rate granted Kuwaiti investors in the U.S., the American would have paid just \$454.
- Bar values, in the order extracted: 6.06, 5.63, 5.21, 4.64, 4.39, 4.39, 3.95, 3.67, 2.50, 2.24, 1.97, 1.86, 1.29, 1.02, 0.76, 0.14
- Bar labels, in the order extracted: Japan, Cayman Islands, Spain, Great Britain, Canada, France, Italy, Netherlands, Saudi Arabia, Belgium, Kuwait, Neth. Antilles, Taiwan, Singapore, Finland, United Arab Emirates

*\$30,000 to \$40,000

Source: Philadelphia Inquirer, Internal Revenue Service.

Source: Home News, November 17, 1991. Reprinted by permission of Knight-Ridder Tribune News

10. Use the EPS and DPS data given in Table 2.3 in Chap. 2 to construct frequency distributions.
11. Use the data from question 10 to construct a relative frequency graph and a cumulative relative frequency graph for both EPS and DPS.
12. On November 17, 1991, the Home News of central New Jersey used the bar chart in the accompanying figure to show the 1980–1991 passenger traffic trends for Newark International Airport.
    (a) Use these data to draw a line chart and interpret your results.
    (b) Use these data to draw a stem-and-leaf diagram and interpret your results.
Bar chart for question 12 (x-axis: year, 80 through 91; y-axis: passengers, in millions, 0 to 30)

- Bar values, in the order extracted: 9.2, 10.2, 12.0, 17.4, 23.6, 28.8, 29.4, 23.4, 22.4, 20.9, 22.3

Source: Port Authority of NY and NJ

Source: Home News, November 17, 1991. Reprinted by permission of the publisher

13. An advertising executive is interested in the age distribution of the subscribers to Person magazine. The age distribution is as follows:

    | Age | Number of subscribers |
    |---|---|
    | 18–25 | 10,000 |
    | 26–35 | 25,000 |
    | 36–45 | 28,000 |
    | 46–55 | 19,000 |
    | 56–65 | 10,000 |
    | Over 65 | 7,000 |

    (a) Use a frequency distribution graph to present these data.
    (b) Use a relative frequency distribution to present these data.
14. Use the data from question 13 to produce a cumulative frequency graph and a cumulative relative frequency graph.