1. STAT 3615: BIOLOGICAL STATISTICS
Chapter 1: Picturing Distributions with Graphs - Part II
Hamdy F. F. Mahmoud, PhD
Collegiate Assistant Professor
Statistics Department @ VT
2. In this chapter, we cover
• Definition of statistics
• Individuals and variables
• Two types of data: categorical and quantitative
• Ways to chart categorical data: bar graphs and pie charts
• Ways to chart quantitative data: histograms, dotplots, and stemplots
• Graphing time series: time plots
3. Presenting a Numerical Variable in a Graph
Numerical (quantitative) variables can be presented graphically
using a histogram, a stem-and-leaf plot (stemplot), or a dot plot.
4. The great white shark, Carcharodon carcharias, is a large ocean
predator at the top of the food chain. Here are the lengths in
feet of 44 great white sharks:
18.7 12.3 18.6 16.4 15.7 18.3 14.6 15.8 14.9 17.6 12.1
16.4 16.7 17.8 16.2 17.8 13.8 13.8 12.2 15.2 14.7 12.4
13.2 15.8 14.3 16.6 9.4 18.2 13.2 13.6 15.3 16.1 13.5
19.1 16.2 22.8 16.8 13.6 13.2 15.7 19.7 18.7 13.2 16.8
• Make a histogram for this data set and comment on it.
• What is the length distribution of the great white shark?
5. Histogram of the great white shark.
Class      Count
9 to 11    1
11 to 13   5
13 to 15   12
15 to 17   15
17 to 19   8
19 to 21   2
21 to 23   1
Total      44
The height of the bar for the class 17 to 19 is 8 because 8 of the
observations have values between 17 and 19 feet.
6. Steps to make a Histogram:
Step 1: Determine the minimum and maximum values.
Minimum = 9.4
Maximum = 22.8
Step 2: Choose the number of classes.
Let us use 7 classes.
Step 3: Determine the class interval i using this formula:
$$ i \geq \frac{\text{Maximum value} - \text{Minimum value}}{k} $$
where k is the number of classes chosen in Step 2.
$$ i \geq \frac{22.8 - 9.4}{7} \approx 1.91 $$
- Always round up, so we can use an interval of 2.
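The arithmetic in Steps 1 to 3 can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, rounding up to the next whole number is one convenient reading of "always round up", and the starting point 9 is taken from the first class of the shark table.

```python
import math

# Values from Step 1 and Step 2 of the slide.
minimum = 9.4
maximum = 22.8
k = 7  # number of classes

def class_width(lo, hi, classes):
    """Return the raw width (range / classes) and the width rounded up."""
    raw = (hi - lo) / classes
    # Assumption: round up to the next whole number for convenience.
    return raw, math.ceil(raw)

def class_edges(start, width, classes):
    """Return the classes + 1 boundaries, beginning at `start`."""
    return [start + j * width for j in range(classes + 1)]

raw, width = class_width(minimum, maximum, k)
edges = class_edges(9, width, k)  # 9 is the lower edge of the first class

print(round(raw, 2), width)
print(edges)
```

The edges produced this way give the seven classes 9 to 11, 11 to 13, and so on up to 21 to 23, which cover both the minimum and the maximum.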
7. Step 4: Count the individuals in each class.
Class      Count
9 to 11    1
11 to 13   5
13 to 15   12
15 to 17   15
17 to 19   8
19 to 21   2
21 to 23   1
Total      44
Step 5: Draw a histogram using the 7 classes.
Histogram of the great white shark.
The height of the bar for the class 17 to 19 is 8 because 8 of the
observations have values between 17 and 19 feet.
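As a rough sketch of Step 5, the class counts from the table can be turned into a sideways text histogram, one bar per class. The function name and the "#" symbol are arbitrary choices for illustration; a plotting library would draw vertical bars instead.

```python
# Class labels and counts copied from the table on the slide.
classes = ["9 to 11", "11 to 13", "13 to 15", "15 to 17",
           "17 to 19", "19 to 21", "21 to 23"]
counts = [1, 5, 12, 15, 8, 2, 1]

def text_histogram(labels, heights, symbol="#"):
    """Return one text row per class; the bar length equals the count."""
    pad = max(len(label) for label in labels)
    return [f"{label:<{pad}} | {symbol * height} {height}"
            for label, height in zip(labels, heights)]

rows = text_histogram(classes, counts)
print("\n".join(rows))
```

Each bar's length is the number of individuals in that class, so the bar for 17 to 19 has 8 symbols.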
8. Number of classes is arbitrary!
Histogram of the great white shark using different numbers of
classes.
Panels: using 7 classes; using 10 classes.
9. Describing a histogram
• In any graph of data, look for the overall pattern and for
striking deviations from that pattern.
• You can describe the overall pattern of a histogram by its
shape, center, and spread.
• An important kind of deviation is an outlier, an
individual value that falls outside the overall pattern.
11. Examples
Unimodal and skewed to the right (positively skewed)
Unimodal and skewed to the left (negatively skewed)
12. Lyme disease histogram
Patient age (in years) for 241,931 cases of Lyme disease reported in the U.S.
(1992-2006, CDC)
Bimodal distribution: a children cluster and an adult cluster.
13. Center of a histogram: shark length distribution
The counts show that the midpoint of the distribution is about 14
to 17 feet.
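One way to see where the midpoint falls is to add up the class counts from the shark table until half of the 44 observations have been reached. The sketch below does that with running totals; the function name is made up for illustration, and it locates only the class that holds the middle observations, not an exact value.

```python
from itertools import accumulate

# Class labels and counts copied from the shark length table.
classes = ["9 to 11", "11 to 13", "13 to 15", "15 to 17",
           "17 to 19", "19 to 21", "21 to 23"]
counts = [1, 5, 12, 15, 8, 2, 1]

def midpoint_class(labels, heights):
    """Return the first class whose running total reaches half the data."""
    half = sum(heights) / 2
    for label, running in zip(labels, accumulate(heights)):
        if running >= half:
            return label

cumulative = list(accumulate(counts))
print(cumulative)                        # [1, 6, 18, 33, 41, 43, 44]
print(midpoint_class(classes, counts))   # 15 to 17
```

With 44 sharks, the middle observations are the 22nd and 23rd in sorted order, and the running totals show that both fall in the class 15 to 17.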
14. Spread of a histogram: shark length distribution
The spread is from 9.4 to 22.8 feet but …..
15. Outliers: shark length distribution
• Do you think that there are outliers in these data?
18. STEMPLOT
Steps to make a stemplot:
• Separate each observation into a stem, consisting of all but
the final (rightmost) digit, and a leaf, the final digit. Stems may
have as many digits as needed, but each leaf contains only a
single digit.
• Write the stems in a vertical column with the smallest at the top,
and draw a vertical line at the right of this column.
• Write each leaf in the row to the right of its stem, in
increasing order out from the stem.
19. Healing of skin wounds
Biologists studying the healing of skin wounds measured the
rate at which new cells closed a razor cut made in the skin of
an anesthetized newt. Here are the sorted data from 18 newts,
measured in micrometers per hour:
11 12 14 18 22 22 23 23 26
27 28 29 30 33 34 35 35 40
• Draw a stemplot for these data.
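A minimal Python sketch of a stemplot for the newt data is given here as an illustration. The function name is made up, and the code assumes non-negative whole numbers, so that the stem is the tens part and the leaf is the final digit.

```python
from collections import defaultdict

# Healing rates (micrometers per hour) for the 18 newts.
rates = [11, 12, 14, 18, 22, 22, 23, 23, 26,
         27, 28, 29, 30, 33, 34, 35, 35, 40]

def stemplot(values):
    """Return stemplot rows; assumes non-negative whole numbers."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # all but the last digit, last digit
        leaves[stem].append(leaf)
    rows = []
    # List every stem from smallest to largest, even one with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves[stem])
        rows.append(f"{stem} | {row}".rstrip())
    return rows

print("\n".join(stemplot(rates)))
# 1 | 1 2 4 8
# 2 | 2 2 3 3 6 7 8 9
# 3 | 0 3 4 5 5
# 4 | 0
```

Turned on its side, the stemplot looks like a histogram with classes 10 to 19, 20 to 29, and so on, but it keeps the actual values.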
21. DOT PLOT
Steps to draw a dotplot:
• Sort the data set and plot each observation according to its
numerical value along a scaled horizontal axis.
• Identical observations are either superimposed (flat dotplot)
or stacked (asymmetrical dotplot).
• For the previous example, healing of skin wounds, draw a
dotplot.
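The stacked version of the dotplot can be sketched as text for the newt healing rates. The function name and the "o" symbol are illustrative choices, and the code assumes whole-number data so that each value gets one column on the horizontal axis.

```python
from collections import Counter

# Healing rates (micrometers per hour) for the 18 newts.
rates = [11, 12, 14, 18, 22, 22, 23, 23, 26,
         27, 28, 29, 30, 33, 34, 35, 35, 40]

def dotplot(values):
    """Return text rows of a stacked dotplot, top row first."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    rows = []
    # Build from the tallest stack down to the first layer of dots.
    for level in range(max(counts.values()), 0, -1):
        row = "".join("o" if counts[x] >= level else " "
                      for x in range(lo, hi + 1))
        rows.append(row.rstrip())
    return rows

rows = dotplot(rates)
print("\n".join(rows))
print("-" * (max(rates) - min(rates) + 1))  # horizontal axis
print(f"{min(rates)}{max(rates):>{max(rates) - min(rates) - 1}}")
```

Repeated values (22, 23, and 35) show up as stacks of two dots, while every other value has a single dot.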
23. TIME PLOT
A time plot of a variable plots each observation against
the time at which it was measured. Always put time on the
horizontal scale of your plot and the variable you are
measuring on the vertical scale. Connecting the data
points by lines helps emphasize any change over time.
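A rough text sketch of the layout follows: time runs along the horizontal scale and the measured variable along the vertical scale. The monthly values are hypothetical, invented only to show the layout, and the function name is made up. A plotting library would also connect the points with lines, which plain text cannot do.

```python
# Hypothetical whole-number measurements, invented for illustration only.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
values = [3, 4, 4, 6, 5, 7]

def time_plot(times, measurements):
    """Return text rows: one row per value level, one column per time."""
    rows = []
    # Vertical scale: largest value at the top, smallest at the bottom.
    for level in range(max(measurements), min(measurements) - 1, -1):
        cells = ["  * " if m == level else "    " for m in measurements]
        rows.append((f"{level:>3} |" + "".join(cells)).rstrip())
    # Horizontal scale: the times, in the order they were measured.
    axis = "    +" + "----" * len(times)
    labels = "     " + "".join(f"{t:<4}" for t in times).rstrip()
    return rows + [axis, labels]

plot = time_plot(months, values)
print("\n".join(plot))
```

Reading the stars from left to right shows how the variable changes over time.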
25. APPLY YOUR KNOWLEDGE
Question #1: Physicians take a sample of smokers and nonsmokers and
record the time it takes them to fall asleep on seven consecutive
nights. What are the individuals?
A. The physicians
B. The average time it takes to fall asleep on the seven nights
C. The smokers and nonsmokers
D. The amount of time it takes to fall asleep each night
Question #2: A researcher wants to show the distribution of times it
took subjects who smoke to fall asleep. Which graphic would be the
best choice?
A. Bar graph
B. Time plot
C. Histogram
D. Pie chart
26. APPLY YOUR KNOWLEDGE
Question #3: Consider the number of robin’s nests found in 52 locations
throughout a forested area in Vermont. A stemplot of the data appears
below. The shortest leaf is 0.9 and the longest leaf is 7.3. How would
you describe the shape of this distribution?
A. Skewed right
B. Skewed left
C. Symmetric
D. Bimodal
27. APPLY YOUR KNOWLEDGE
Question #4: Below is a monthly time plot for parts per million of
arsenic found in groundwater taken from a certain well. How would
you describe the trend?
A. Increasing
B. Decreasing
C. Flat
D. Varying, going up and down