2. Chapter 2:
Exploring Data with Tables and Graphs
2.1 Frequency Distributions for Organizing and
Summarizing Data
2.2 Histograms
2.3 Graphs that Enlighten and Graphs that Deceive
2.4 Scatterplots, Correlation, and Regression
2
Objectives:
1. Organize data using a frequency distribution.
2. Represent data in frequency distributions graphically using histograms, frequency polygons, and ogives.
3. Represent data using bar graphs, Pareto charts, time series graphs, and pie graphs.
4. Draw and interpret a stem and leaf plot.
5. Draw and interpret a scatter plot for a set of paired data.
3. Recall : 2.1 Frequency Distributions for Organizing and Summarizing Data
Data collected in original form is called raw data.
Frequency Distribution (or Frequency Table)
A frequency distribution is the organization of raw data in table form, using
classes and frequencies. It Shows how data are partitioned among several
categories (or classes) by listing the categories along with the number
(frequency) of data values in each of them.
Nominal- or ordinal-level data that can be placed in categories is organized in
categorical frequency distributions.
Lower class limits: The smallest numbers that can belong to each of the
different classes
Upper class limits: The largest numbers that can belong to each of the
different classes
Class boundaries: The numbers used to separate the classes, but without the
gaps created by class limits
Class midpoints: The values in the middle of the classes Each class midpoint
can be found by adding the lower class limit to the upper class limit
and dividing the sum by 2.
Class width: The difference between two consecutive lower class limits in a
frequency distribution
Procedure for Constructing a
Frequency Distribution
1. Select the number of classes,
usually between 5 and 20.
2. Calculate the class width: π =
πππ₯βπππ
# ππ ππππ π ππ
and round up
accordingly.
3. Choose the value for the first
lower class limit by using either
the minimum value or a
convenient value below the
minimum.
4. Using the first lower class limit
and class width, list the other
lower class limits.
5. List the lower class limits in a
vertical column and then
determine and enter the upper
class limits.
6. Take each individual data value
and put a tally mark in the
appropriate class. Add the tally
marks to get the frequency.
3
4. 2.2 Histograms
While a frequency distribution is a useful tool for summarizing data and
investigating the distribution of data, an even better tool is a histogram, which
is a graph that is easier to interpret than a table of numbers.
Key Concept
Histogram:
A graph consisting of bars of equal width drawn adjacent to each other (unless
there are gaps in the data)
The horizontal scale represents classes of quantitative data values, and the vertical
scale represents frequencies. The heights of the bars correspond to frequency values.
Important Uses of a Histogram
β’ Visually displays the shape of the distribution of the data
β’ Shows the location of the center of the data
β’ Shows the spread of the data & Identifies outliers
4
5. 5
Example 1
Construct a histogram to
represent the data for the record
high temperatures for each of
the 50 states.
Class Limits
Class
Boundaries
Frequency
100 - 104
105 - 109
110 - 114
115 - 119
120 - 124
125 - 129
130 - 134
99.5 - 104.5
104.5 - 109.5
109.5 - 114.5
114.5 - 119.5
119.5 - 124.5
124.5 - 129.5
129.5 - 134.5
2
8
18
13
7
1
1
6. Relative Frequency Histogram
Relative Frequency
Histogram has the same shape
and horizontal scale as a
histogram, but the vertical scale
is marked with relative
frequencies instead of actual
frequencies.
Time
(seconds) Frequency
75-124 11
125-174 24
175-224 10
225-274 3
275-324 2
Example 2
Relative
Frequency = f / n
11 / 50 = 0.22
24 / 50 = 0.48
10 / 50 = 0.2
3 / 50 = 0.06
2 / 50 = 0.04
Percent
Frequency
22%
24%
20%
6%
4%
6
οf = 50 οrf = 1
7. Common
Distribution
Shapes
7
Critical Thinking Interpreting Histograms
Explore the data by analyzing the histogram to see what can be learned about
βCVDOTβ: the Center of the data, the Variation, the shape of the Distribution,
whether there are any Outliers, and Time.
8. Normal Distribution
Because this histogram is roughly
bell-shaped, we say that the data
have a normal distribution.
8
Skewness:
A distribution of data is
skewed if it is not symmetric
and extends more to one side
than to the other.
Data skewed to the right
(positively skewed) have a
longer right tail.
Data skewed to the left
(negative skewed) have a
longer left tail.
9. 9
Example 3
Given Histogram: Find the following:
1. Estimate the number of subjects
2. The width
3. Boundaries of the 1st bar
4. Explain the gap.
5. Normal?
Find the following:
1. The number of subjects 12 + 20 + 8 + 0 + 0 + 5 + 18 + 14 + 3 = 80
2. The width; 5.6 β 5.5 = 0.1 g
3. Boundaries of the 1st bar: 5.5g & 5.6g
4. The Gap: The 1st 40 quarters are from a different era than the 2nd set of 40 quarters.
5. Not Normally Distribution
Pre 1964:
90% silver + 10 % copper
Post 1964:
Copper-Nickel alloy
10. Assessing Normality with Normal Quantile Plots
Criteria for Assessing Normality with a Normal Quantile Plot
Normal Distribution: The pattern of the points in the normal quantile plot is reasonably close to a
straight line, and the points do not show some systematic pattern that is not a straight-line pattern.
Not a Normal Distribution: The population distribution is not normal if the normal quantile plot has
either or both of these two conditions:
ο The points do not lie reasonably close to a straight-line pattern.
ο The points show some systematic pattern that is not a straight-line pattern.
10
Normal Distribution: The points are
reasonably close to a straight-line pattern, and
there is no other systematic pattern that is not
a straight-line pattern.
Not a Normal Distribution: The
points do not lie reasonably close
to a straight line.
Normal Quantile Plot:
A normal quantile plot
(or normal probability
plot) is a graph of
points (x, y) where each
x value is from the
original set of sample
data, and each y value
is the corresponding z
score that is expected
from the standard
normal distribution.