SlideShare a Scribd company logo
1 of 93
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.
Chapter
Describing the
Relation between
Two Variables
4
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.
Section
Scatter Diagrams
and Correlation
4.1
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.
Objectives
1. Draw and interpret scatter diagrams
2. Describe the properties of the linear
correlation coefficient
3. Compute and interpret the linear correlation
coefficient
4. Determine whether a linear relation exists
between two variables
5. Explain the difference between correlation
and causation
4-3
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-4
Objective 1
• Draw and interpret scatter diagrams
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-5
The response variable is the variable whose
value can be explained by the value of the
explanatory or predictor variable.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-6
A scatter diagram is a graph that shows the
relationship between two quantitative variables
measured on the same individual. Each individual
in the data set is represented by a point in the
scatter diagram. The explanatory variable is
plotted on the horizontal axis, and the response
variable is plotted on the vertical axis.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-7
EXAMPLE Drawing and Interpreting a Scatter DiagramEXAMPLE Drawing and Interpreting a Scatter Diagram
The data shown to the right are
based on a study for drilling rock.
The researchers wanted to
determine whether the time it
takes to dry drill a distance of 5
feet in rock increases with the
depth at which the drilling
begins. So, depth at which
drilling begins is the explanatory
variable, x, and time (in minutes)
to drill five feet is the response
variable, y. Draw a scatter
diagram of the data.
Source: Penner, R., and Watts, D.G. “Mining Information.” The American
Statistician, Vol. 45, No. 1, Feb. 1991, p. 6.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-8
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-9
Various Types of Relations in a Scatter Diagram
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-10
Two variables that are linearly related are
positively associated when above-average values
of one variable are associated with above-average
values of the other variable and below-average
values of one variable are associated with below-
average values of the other variable. That is, two
variables are positively associated if, whenever
the value of one variable increases, the value of
the other variable also increases.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-11
Two variables that are linearly related are
negatively associated when above-average
values of one variable are associated with below-
average values of the other variable. That is, two
variables are negatively associated if, whenever
the value of one variable increases, the value of
the other variable decreases.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-12
Objective 2
• Describe the Properties of the Linear
Correlation Coefficient
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-13
The linear correlation coefficient or Pearson
product moment correlation coefficient is a
measure of the strength and direction of the linear
relation between two quantitative variables. The
Greek letter ρ (rho) represents the population
correlation coefficient, and r represents the
sample correlation coefficient. We present only
the formula for the sample correlation coefficient.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-14
Sample Linear Correlation Coefficient
where is the sample mean of the explanatory
variable
sx is the sample standard deviation of the explanatory
variable
is the sample mean of the response variable
sy is the sample standard deviation of the response
variable
n is the number of individuals in the sample
x
y
r =
xi − x
sx




yi − y
sy





∑
n −1
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-15
Properties of the Linear Correlation Coefficient
1.The linear correlation coefficient is always between
–1 and 1, inclusive. That is, –1 ≤ r ≤ 1.
2.If r = + 1, then a perfect positive linear relation
exists between the two variables.
3.If r = –1, then a perfect negative linear relation
exists between the two variables.
4.The closer r is to +1, the stronger is the evidence of
positive association between the two variables.
5.The closer r is to –1, the stronger is the evidence of
negative association between the two variables.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-16
6. If r is close to 0, then little or no evidence exists of
a linear relation between the two variables. So r
close to 0 does not imply no relation, just no linear
relation.
7. The linear correlation coefficient is a unitless
measure of association. So the unit of measure for
x and y plays no role in the interpretation of r.
8. The correlation coefficient is not resistant.
Therefore, an observation that does not follow the
overall pattern of the data could affect the value of
the linear correlation coefficient.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-17
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-18
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-19
Objective 3
• Compute and Interpret the Linear Correlation
Coefficient
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-20
EXAMPLE Determining the Linear Correlation CoefficientEXAMPLE Determining the Linear Correlation Coefficient
Determine the linear
correlation coefficient
of the drilling data.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-21
xi − x
sx
yi − y
sy
xi − x
sx




yi − y
sy






xi − x
sx




yi − y
sy





∑ =x = y =
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-22
r =
xi − x
sx



∑
yi − y
sy






n −1
=
8.501037
12 −1
= 0.773
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-23
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-24
Objective 4
• Determine Whether a Linear Relation Exists
between Two Variables
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-25
Testing for a Linear Relation
Step 1 Determine the absolute value of the
correlation coefficient.
Step 2 Find the critical value in Table II from
Appendix A for the given sample size.
Step 3 If the absolute value of the correlation
coefficient is greater than the critical value, we
say a linear relation exists between the two
variables. Otherwise, no linear relation exists.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-26
EXAMPLE Does a Linear Relation Exist?EXAMPLE Does a Linear Relation Exist?
Determine whether a linear relation exists between time to drill
five feet and depth at which drilling begins. Comment on the type
of relation that appears to exist between time to drill five feet and
depth at which drilling begins.
The correlation between drilling
depth and time to drill is 0.773.
The critical value for n = 12
observations is 0.576. Since
0.773 > 0.576, there is a positive
linear relation between time to
drill five feet and depth at which
drilling begins.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-27
Objective 5
• Explain the Difference between Correlation
and Causation
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-28
According to data obtained from the Statistical
Abstract of the United States, the correlation
between the percentage of the female population
with a bachelor’s degree and the percentage of
births to unmarried mothers since 1990 is 0.940.
Does this mean that a higher percentage of females
with bachelor’s degrees causes a higher percentage
of births to unmarried mothers?
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-29
Certainly not! The correlation exists only because
both percentages have been increasing since 1990.
It is this relation that causes the high correlation.
In general, time series data (data collected over
time) may have high correlations because each
variable is moving in a specific direction over time
(both going up or down over time; one increasing,
while the other is decreasing over time).
When data are observational, we cannot claim a
causal relation exists between two variables. We
can only claim causality when the data are
collected through a designed experiment.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-30
Another way that two variables can be related
even though there is not a causal relation is
through a lurking variable.
A lurking variable is related to both the
explanatory and response variable.
For example, ice cream sales and crime rates have
a very high correlation. Does this mean that local
governments should shut down all ice cream
shops? No! The lurking variable is temperature.
As air temperatures rise, both ice cream sales and
crime rates rise.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-31
EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study
Because colas tend to replace healthier
beverages and colas contain caffeine
and phosphoric acid, researchers
Katherine L. Tucker and associates
wanted to know whether cola
consumption is associated with lower
bone mineral density in women. The
table lists the typical number of cans of
cola consumed in a week and the
femoral neck bone mineral density for a
sample of 15 women. The data were
collected through a prospective cohort
study.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-32
EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study
The figure on the next slide shows the scatter diagram of
the data. The correlation between number of colas per week
and bone mineral density is –0.806.The critical value for
correlation with n = 15 from Table II in Appendix A is
0.514. Because |–0.806| > 0.514, we conclude a negative
linear relation exists between number of colas consumed
and bone mineral density. Can the authors conclude that an
increase in the number of colas consumed causes a decrease
in bone mineral density? Identify some lurking variables in
the study.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-33
EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-34
EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study
In prospective cohort studies, data are collected on a
group of subjects through questionnaires and surveys
over time. Therefore, the data are observational. So the
researchers cannot claim that increased cola consumption
causes a decrease in bone mineral density.
Some lurking variables in the study that could confound
the results are:
• body mass index
• height
• smoking
• alcohol consumption
• calcium intake
• physical activity
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-35
EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study
The authors were careful to say that increased cola
consumption is associated with lower bone mineral
density because of potential lurking variables. They
never stated that increased cola consumption causes
lower bone mineral density.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.
Section
Least-squares
Regression
4.2
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-37
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.
Objectives
1. Find the least-squares regression line and use
the line to make predictions
2. Interpret the slope and the y-intercept of the
least-squares regression line
3. Compute the sum of squared residuals
4-38
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-39
Using the following sample data:
Using (2, 5.7) and (6, 1.9):
m =
5.7 −1.9
2 − 6
= −0.95
y − y1 = m x − x1( )
y − 5.7 = −0.95 x − 2( )
y − 5.7 = −0.95x +1.9
y = −0.95x + 7.6
EXAMPLE Finding an Equation that Describes Linearly
Relate Data
EXAMPLE Finding an Equation that Describes Linearly
Relate Data
(a) Find a linear equation that relates x (the
explanatory variable) and y (the response variable) by
selecting two points and finding the equation of the
line containing the points.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-40
(b) Graph the equation on the scatter diagram.
(c) Use the equation to predict
y if x = 3.
y = −0.95x + 7.6
= −0.95(3) + 7.6
= 4.75
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-41
Objective 1
• Find the least-squares regression line and use
the line to make predictions
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-42
}
(3, 5.2)
residual = observed y – predicted y
= 5.2 – 4.75
= 0.45
The difference between the observed value of y and
the predicted value of y is the error, or residual.
Using the line from the last example, and the
predicted value at x = 3:
residual = observed y – predicted y
= 5.2 – 4.75
= 0.45
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-43
Least-Squares Regression Criterion
The least-squares regression line is the line
that minimizes the sum of the squared errors (or
residuals). This line minimizes the sum of the
squared vertical distance between the observed
values of y and those predicted by the line ,
(“y-hat”). We represent this as
“ minimize Σ residuals2
”.
ˆy
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-44
The Least-Squares Regression Line
The equation of the least-squares regression line is
given by
ˆy = b1x + b0
where
where
is the slope of the least-squares
regression line
b1 = r ⋅
sy
sx
b0 = y − b1x is the y-intercept of the least-
squares regression line
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-45
The Least-Squares Regression Line
Note: is the sample mean and sx is the sample
standard deviation of the explanatory variable x ;
is the sample mean and sy is the sample
standard deviation of the response variable y.
y
x
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-46
EXAMPLE Finding the Least-squares Regression LineEXAMPLE Finding the Least-squares Regression Line
Using the drilling data
(a)Find the least-squares
regression line.
(b)Predict the drilling time if
drilling starts at 130 feet.
(c)Is the observed drilling
time at 130 feet above, or
below, average.
(d)Draw the least-squares
regression line on the scatter
diagram of the data.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-47
(a) We agree to round the estimates of the
slope and intercept to four decimal places.
(b)
(c) The observed drilling time is 6.93
seconds. The predicted drilling time is
7.035 seconds. The drilling time of 6.93
seconds is below average.
ˆy = 0.0116x + 5.5273
ˆy = 0.0116x + 5.5273
= 0.0116(130) + 5.5273
= 7.035
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-48
(d)
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-49
Objective 2
• Interpret the Slope and the y-intercept of the
Least-Squares Regression Line
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-50
Interpretation of Slope:
The slope of the regression line is 0.0116. For each
additional foot of depth we start drilling, the time to
drill five feet increases by 0.0116 minutes, on average.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-51
Interpretation of the y-Intercept: The y-intercept of
the regression line is 5.5273. To interpret the y-
intercept, we must first ask two questions:
1. Is 0 a reasonable value for the explanatory variable?
2. Do any observations near x = 0 exist in the data set?
A value of 0 is reasonable for the drilling data (this
indicates that drilling begins at the surface of Earth.
The smallest observation in the data set is x = 35 feet,
which is reasonably close to 0. So, interpretation of the
y-intercept is reasonable.
The time to drill five feet when we begin drilling at the
surface of Earth is 5.5273 minutes.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-52
If the least-squares regression line is used to make
predictions based on values of the explanatory
variable that are much larger or much smaller than
the observed values, we say the researcher is
working outside the scope of the model. Never use
a least-squares regression line to make predictions
outside the scope of the model because we can’t be
sure the linear relation continues to exist.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-53
Objective 3
• Compute the Sum of Squared Residuals
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-54
To illustrate the fact that the sum of squared
residuals for a least-squares regression line
is less than the sum of squared residuals for
any other line, use the “regression by eye”
applet.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.
Section
The Coefficient of
Determination
4.3
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-56
Objective 1
• Compute and Interpret the Coefficient of
Determination
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-57
The coefficient of determination, R2
,
measures the proportion of total variation in
the response variable that is explained by the
least-squares regression line.
The coefficient of determination is a number
between 0 and 1, inclusive. That is, 0 < R2
< 1.
If R2
= 0 the line has no explanatory value
If R2
= 1 means the line explains 100% of the
variation in the response variable.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-58
The data to the right are based
on a study for drilling rock.
The researchers wanted to
determine whether the time it
takes to dry drill a distance of
5 feet in rock increases with
the depth at which the drilling
begins. So, depth at which
drilling begins is the predictor
variable, x, and time (in
minutes) to drill five feet is
the response variable, y.
Source: Penner, R., and Watts, D.G. “Mining Information.”
The American Statistician, Vol. 45, No. 1, Feb. 1991, p. 6.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-59
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-60
Regression Analysis
The regression equation is
Time = 5.53 + 0.0116 Depth
Sample Statistics
Mean Standard Deviation
Depth 126.2 52.2
Time 6.99 0.781
Correlation Between Depth and Time: 0.773
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-61
Suppose we were asked to predict the time to
drill an additional 5 feet, but we did not
know the current depth of the drill. What
would be our best “guess”?
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-62
Suppose we were asked to predict the time to
drill an additional 5 feet, but we did not
know the current depth of the drill. What
would be our best “guess”?
ANSWER:
The mean time to drill an additional 5 feet:
6.99 minutes
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-63
Now suppose that we are asked to predict the
time to drill an additional 5 feet if the current
depth of the drill is 160 feet?
ANSWER:
Our “guess” increased from 6.99 minutes to
7.39 minutes based on the knowledge that drill
depth is positively associated with drill time.
ˆy = 5.53+ 0.0116(160) = 7.39
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-64
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-65
The difference between the observed
value of the response variable and the
mean value of the response variable is
called the total deviation and is equal to
y − y
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-66
The difference between the predicted value
of the response variable and the mean value
of the response variable is called the
explained deviation and is equal to
ˆy − y
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-67
The difference between the observed
value of the response variable and the
predicted value of the response variable is
called the unexplained deviation and is
equal to
y − ˆy
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-68
Total
Deviation
y − y = y − ˆy( ) + ˆy − y( )
Unexplained
Deviation
Explained
Deviation
+=
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-69
Total
Deviation
y − y( )2
∑ = y − ˆy( )2
∑ + ˆy − y( )2
∑
Unexplained
Deviation
Explained
Deviation
+=
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-70
Total Variation = Unexplained Variation + Explained Variation
1 =
Unexplained Variation Explained Variation
Unexplained VariationExplained Variation
Total Variation Total Variation
Total VariationTotal Variation
+
= 1 –R2
=
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-71
To determine R2
for the linear regression model simply
square the value of the linear correlation coefficient.
To determine R2
for the linear regression model simply
square the value of the linear correlation coefficient.
Squaring the linear correlation coefficient to obtain
the coefficient of determination works only for the
least-squares linear regression model
The method does not work in general.
ˆy = b1x + b0
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-72
EXAMPLE Determining the Coefficient of DeterminationEXAMPLE Determining the Coefficient of Determination
Find and interpret the coefficient of
determination for the drilling data.
Because the linear correlation coefficient, r, is
0.773, we have that
R2
= 0.7732
= 0.5975 = 59.75%.
So, 59.75% of the variability in drilling time is
explained by the least-squares regression line.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-73
Draw a scatter diagram for each of these data
sets. For each data set, the variance of y is 17.49.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-74
Data Set A Data Set B Data Set C
Data Set A: 99.99% of the variability in y is explained
by the least-squares regression line
Data Set B: 94.7% of the variability in y is explained by
the least-squares regression line
Data Set C: 9.4% of the variability in y is explained by
the least-squares regression line
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.
Section
Contingency
Tables and
Association
4.4
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.
Objectives
1. Compute the marginal distribution of a
variable
2. Use the conditional distribution to identify
association among categorical data
3. Explain Simpson’s Paradox
4-76
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-77
A professor at a community college in New Mexico conducted
a study to assess the effectiveness of delivering an introductory
statistics course via traditional lecture-based method, online
delivery (no classroom instruction), and hybrid instruction
(online course with weekly meetings) methods, the grades
students received in each of the courses were tallied.
The table is referred to as a contingency table, or two-way
table, because it relates two categories of data. The row
variable is grade, because each row in the table describes the
grade received for each group. The column variable is delivery
method. Each box inside the table is referred to as a cell.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-78
Objective 1
• Compute the Marginal Distribution of a
Variable
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-79
A marginal distribution of a variable is a
frequency or relative frequency distribution of
either the row or column variable in the
contingency table.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-80
EXAMPLE Determining Frequency Marginal DistributionsEXAMPLE Determining Frequency Marginal Distributions
A professor at a community college in New Mexico
conducted a study to assess the effectiveness of delivering an
introductory statistics course via traditional lecture-based
method, online delivery (no classroom instruction), and
hybrid instruction (online course with weekly meetings)
methods, the grades students received in each of the courses
were tallied. Find the frequency marginal distributions for
course grade and delivery method.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-81
EXAMPLE Determining Relative Frequency Marginal DistributionsEXAMPLE Determining Relative Frequency Marginal Distributions
Determine the relative frequency marginal distribution for
course grade and delivery method.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-82
Objective 2
• Use the Conditional Distribution to Identify
Association among Categorical Data
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-83
A conditional distribution lists the relative
frequency of each category of the response
variable, given a specific value of the
explanatory variable in the contingency table.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-84
EXAMPLE Determining a Conditional DistributionEXAMPLE Determining a Conditional Distribution
Construct a conditional
distribution of course
grade by method of
delivery. Comment on
any type of association
that may exist between
course grade and
delivery method.
It appears that students in the hybrid course are more
likely to pass (A, B, or C) than the other two methods.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-85
EXAMPLE Drawing a Bar Graph of a Conditional DistributionEXAMPLE Drawing a Bar Graph of a Conditional Distribution
Using the results of the previous example, draw
a bar graph that represents the conditional
distribution of method of delivery by grade
earned.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-86
The following contingency table shows the
survival status and demographics of passengers
on the ill-fated Titanic.
Draw a conditional
bar graph of
survival status by
demographic
characteristic.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-87
Objective 1
• Explain Simpson’s Paradox
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-88
Insulin dependent (or Type 1) diabetes is a disease that
results in the permanent destruction of insulin-
producing beta cells of the pancreas. Type 1 diabetes
is lethal unless treatment with insulin injections
replaces the missing hormone. Individuals with
insulin independent (or Type 2) diabetes can produce
insulin internally. The data shown in the table below
represent the survival status of 902 patients with
diabetes by type over a 5-year period.
Type 1 Type 2 Total
Survived 253 326 579
Died 105 218 323
358 544 902
EXAMPLE Illustrating Simpson’s ParadoxEXAMPLE Illustrating Simpson’s Paradox
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-89
EXAMPLE Illustrating Simpson’s ParadoxEXAMPLE Illustrating Simpson’s Paradox
From the table, the proportion of patients with
Type 1 diabetes who died was 105/358 = 0.29;
the proportion of patients with Type 2 diabetes
who died was 218/544 = 0.40. Based on this, we
might conclude that Type 2 diabetes is more
lethal than Type 1 diabetes.
Type 1 Type 2 Total
Survived 253 326 579
Died 105 218 323
358 544 902
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-90
However, Type 2 diabetes is usually contracted after the
age of 40. If we account for the variable age and divide
our patients into two groups (those 40 or younger and
those over 40), we obtain the data in the table below.
Type 1 Type 2 Total
< 40 > 40 < 40 > 40
Survived 129 124 15 311 579
Died 1 104 0 218 323
130 228 15 529 902
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-91
Of the diabetics 40 years of age or younger, the
proportion of those with Type 1 diabetes who died is
1/130 = 0.008; the proportion of those with Type 2
diabetes who died is 0/15 = 0.
Type 1 Type 2 Total
< 40 > 40 < 40 > 40
Survived 129 124 15 311 579
Died 1 104 0 218 323
130 228 15 529 902
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-92
Of the diabetics over 40 years of age, the proportion
of those with Type 1 diabetes who died is 104/228 =
0.456; the proportion of those with Type 2 diabetes
who died is 218/529 = 0.412.
The lurking variable age led us to believe that Type 2
diabetes is the more dangerous type of diabetes.
Type 1 Type 2 Total
< 40 > 40 < 40 > 40
Survived 129 124 15 311 579
Died 1 104 0 218 323
130 228 15 529 902
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-93
Simpson’s Paradox represents a situation in
which an association between two variables
inverts or goes away when a third variable is
introduced to the analysis.

More Related Content

Viewers also liked

Topic 15 correlation spss
Topic 15 correlation spssTopic 15 correlation spss
Topic 15 correlation spssSizwan Ahammed
 
Correlation & Linear Regression
Correlation & Linear RegressionCorrelation & Linear Regression
Correlation & Linear RegressionAzmi Mohd Tamil
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionMohit Asija
 
0601004 final project ajay
0601004 final project ajay0601004 final project ajay
0601004 final project ajaySupa Buoy
 
Indianairportssectorreportaugust2013 140120043048-phpapp01
Indianairportssectorreportaugust2013 140120043048-phpapp01Indianairportssectorreportaugust2013 140120043048-phpapp01
Indianairportssectorreportaugust2013 140120043048-phpapp01Supa Buoy
 
India digital-future-in-focus-2013
India digital-future-in-focus-2013India digital-future-in-focus-2013
India digital-future-in-focus-2013Supa Buoy
 
0601030 financial analysis
0601030 financial analysis0601030 financial analysis
0601030 financial analysisSupa Buoy
 
Circuits i maquines:Eric i Maria
Circuits i maquines:Eric i MariaCircuits i maquines:Eric i Maria
Circuits i maquines:Eric i Marialagessera
 
Working capital management
Working capital managementWorking capital management
Working capital managementSupa Buoy
 
Circuits elèctrics: Manel i Dèlia
Circuits elèctrics: Manel i DèliaCircuits elèctrics: Manel i Dèlia
Circuits elèctrics: Manel i Dèlialagessera
 
Kidney Disease Screening in Fiji - Dr Angus Ritchie
Kidney Disease Screening in Fiji - Dr Angus RitchieKidney Disease Screening in Fiji - Dr Angus Ritchie
Kidney Disease Screening in Fiji - Dr Angus RitchieAngus Ritchie
 
0601099 market survey & agency development
0601099 market survey & agency development0601099 market survey & agency development
0601099 market survey & agency developmentSupa Buoy
 
Fem un mural de la bruixeta fa bondat
Fem un mural de la bruixeta fa bondatFem un mural de la bruixeta fa bondat
Fem un mural de la bruixeta fa bondatlagessera
 
0601015 training & development
0601015 training & development0601015 training & development
0601015 training & developmentSupa Buoy
 
The mindmapbook
The mindmapbookThe mindmapbook
The mindmapbooksooner123
 
0601036 causes of attrition amongst the local and outside workers
0601036 causes of attrition amongst the local and outside workers0601036 causes of attrition amongst the local and outside workers
0601036 causes of attrition amongst the local and outside workersSupa Buoy
 
Behavioural Meetup: Guy Fielding
Behavioural Meetup: Guy FieldingBehavioural Meetup: Guy Fielding
Behavioural Meetup: Guy Fieldingbehavioural
 
0601079 advertising and brand image
0601079 advertising and brand image0601079 advertising and brand image
0601079 advertising and brand imageSupa Buoy
 

Viewers also liked (20)

Les5e ppt 09
Les5e ppt 09Les5e ppt 09
Les5e ppt 09
 
Measuring relationships
Measuring relationshipsMeasuring relationships
Measuring relationships
 
Topic 15 correlation spss
Topic 15 correlation spssTopic 15 correlation spss
Topic 15 correlation spss
 
Correlation & Linear Regression
Correlation & Linear RegressionCorrelation & Linear Regression
Correlation & Linear Regression
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
0601004 final project ajay
0601004 final project ajay0601004 final project ajay
0601004 final project ajay
 
Indianairportssectorreportaugust2013 140120043048-phpapp01
Indianairportssectorreportaugust2013 140120043048-phpapp01Indianairportssectorreportaugust2013 140120043048-phpapp01
Indianairportssectorreportaugust2013 140120043048-phpapp01
 
India digital-future-in-focus-2013
India digital-future-in-focus-2013India digital-future-in-focus-2013
India digital-future-in-focus-2013
 
0601030 financial analysis
0601030 financial analysis0601030 financial analysis
0601030 financial analysis
 
Circuits i maquines:Eric i Maria
Circuits i maquines:Eric i MariaCircuits i maquines:Eric i Maria
Circuits i maquines:Eric i Maria
 
Working capital management
Working capital managementWorking capital management
Working capital management
 
Circuits elèctrics: Manel i Dèlia
Circuits elèctrics: Manel i DèliaCircuits elèctrics: Manel i Dèlia
Circuits elèctrics: Manel i Dèlia
 
Kidney Disease Screening in Fiji - Dr Angus Ritchie
Kidney Disease Screening in Fiji - Dr Angus RitchieKidney Disease Screening in Fiji - Dr Angus Ritchie
Kidney Disease Screening in Fiji - Dr Angus Ritchie
 
0601099 market survey & agency development
0601099 market survey & agency development0601099 market survey & agency development
0601099 market survey & agency development
 
Fem un mural de la bruixeta fa bondat
Fem un mural de la bruixeta fa bondatFem un mural de la bruixeta fa bondat
Fem un mural de la bruixeta fa bondat
 
0601015 training & development
0601015 training & development0601015 training & development
0601015 training & development
 
The mindmapbook
The mindmapbookThe mindmapbook
The mindmapbook
 
0601036 causes of attrition amongst the local and outside workers
0601036 causes of attrition amongst the local and outside workers0601036 causes of attrition amongst the local and outside workers
0601036 causes of attrition amongst the local and outside workers
 
Behavioural Meetup: Guy Fielding
Behavioural Meetup: Guy FieldingBehavioural Meetup: Guy Fielding
Behavioural Meetup: Guy Fielding
 
0601079 advertising and brand image
0601079 advertising and brand image0601079 advertising and brand image
0601079 advertising and brand image
 

Similar to Sfs4e ppt 04

Aron chpt 3 correlation
Aron chpt 3 correlationAron chpt 3 correlation
Aron chpt 3 correlationKaren Price
 
Aron chpt 3 correlation compatability version f2011
Aron chpt 3 correlation compatability version f2011Aron chpt 3 correlation compatability version f2011
Aron chpt 3 correlation compatability version f2011Sandra Nicks
 
Stats For Life Module7 Oc
Stats For Life Module7 OcStats For Life Module7 Oc
Stats For Life Module7 OcN Rabe
 
Exercise 14Understanding Simple Linear RegressionStatistical Tec.docx
Exercise 14Understanding Simple Linear RegressionStatistical Tec.docxExercise 14Understanding Simple Linear RegressionStatistical Tec.docx
Exercise 14Understanding Simple Linear RegressionStatistical Tec.docxAlleneMcclendon878
 
Unit ibp801 t l multiple correlation a24022022
Unit ibp801 t  l multiple correlation a24022022Unit ibp801 t  l multiple correlation a24022022
Unit ibp801 t l multiple correlation a24022022ashish7sattee
 
Statistics- chapter4.pdf
Statistics- chapter4.pdfStatistics- chapter4.pdf
Statistics- chapter4.pdfPaulosMekuria
 
Data Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and CorrelationData Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and CorrelationJanet Penilla
 
12.1 Korelasi (1).pptx
12.1 Korelasi (1).pptx12.1 Korelasi (1).pptx
12.1 Korelasi (1).pptxssusereb82c1
 
Correlation- an introduction and application of spearman rank correlation by...
Correlation- an introduction and application of spearman rank correlation  by...Correlation- an introduction and application of spearman rank correlation  by...
Correlation- an introduction and application of spearman rank correlation by...Gunjan Verma
 
Correlation and Regression.pdf
Correlation and Regression.pdfCorrelation and Regression.pdf
Correlation and Regression.pdfAadarshSah1
 
Null hypothesis for pearson correlation (conceptual)
Null hypothesis for pearson correlation (conceptual)Null hypothesis for pearson correlation (conceptual)
Null hypothesis for pearson correlation (conceptual)CTLTLA
 
Null hypothesis for pearson correlation
Null hypothesis for pearson correlationNull hypothesis for pearson correlation
Null hypothesis for pearson correlationKen Plummer
 
Ebd1 lecture 8 2010
Ebd1  lecture 8 2010Ebd1  lecture 8 2010
Ebd1 lecture 8 2010Reko Kemo
 
Null hypothesis for Pearson Correlation (independence)
Null hypothesis for Pearson Correlation (independence)Null hypothesis for Pearson Correlation (independence)
Null hypothesis for Pearson Correlation (independence)Ken Plummer
 

Similar to Sfs4e ppt 04 (20)

Chapter 01 ncc STAT
Chapter 01 ncc STAT Chapter 01 ncc STAT
Chapter 01 ncc STAT
 
Aron chpt 3 correlation
Aron chpt 3 correlationAron chpt 3 correlation
Aron chpt 3 correlation
 
Aron chpt 3 correlation compatability version f2011
Aron chpt 3 correlation compatability version f2011Aron chpt 3 correlation compatability version f2011
Aron chpt 3 correlation compatability version f2011
 
Stats For Life Module7 Oc
Stats For Life Module7 OcStats For Life Module7 Oc
Stats For Life Module7 Oc
 
Exercise 14Understanding Simple Linear RegressionStatistical Tec.docx
Exercise 14Understanding Simple Linear RegressionStatistical Tec.docxExercise 14Understanding Simple Linear RegressionStatistical Tec.docx
Exercise 14Understanding Simple Linear RegressionStatistical Tec.docx
 
Unit ibp801 t l multiple correlation a24022022
Unit ibp801 t  l multiple correlation a24022022Unit ibp801 t  l multiple correlation a24022022
Unit ibp801 t l multiple correlation a24022022
 
Statistics- chapter4.pdf
Statistics- chapter4.pdfStatistics- chapter4.pdf
Statistics- chapter4.pdf
 
Data Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and CorrelationData Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and Correlation
 
Correltional research
Correltional researchCorreltional research
Correltional research
 
12.1 Korelasi (1).pptx
12.1 Korelasi (1).pptx12.1 Korelasi (1).pptx
12.1 Korelasi (1).pptx
 
Correlation- an introduction and application of spearman rank correlation by...
Correlation- an introduction and application of spearman rank correlation  by...Correlation- an introduction and application of spearman rank correlation  by...
Correlation- an introduction and application of spearman rank correlation by...
 
Correlation and Regression.pdf
Correlation and Regression.pdfCorrelation and Regression.pdf
Correlation and Regression.pdf
 
Null hypothesis for pearson correlation (conceptual)
Null hypothesis for pearson correlation (conceptual)Null hypothesis for pearson correlation (conceptual)
Null hypothesis for pearson correlation (conceptual)
 
Correlational_Analysis.pptx
Correlational_Analysis.pptxCorrelational_Analysis.pptx
Correlational_Analysis.pptx
 
Null hypothesis for pearson correlation
Null hypothesis for pearson correlationNull hypothesis for pearson correlation
Null hypothesis for pearson correlation
 
Ebd1 lecture 8 2010
Ebd1  lecture 8 2010Ebd1  lecture 8 2010
Ebd1 lecture 8 2010
 
Module 4 -_logistic_regression
Module 4 -_logistic_regressionModule 4 -_logistic_regression
Module 4 -_logistic_regression
 
4. correlations
4. correlations4. correlations
4. correlations
 
Null hypothesis for Pearson Correlation (independence)
Null hypothesis for Pearson Correlation (independence)Null hypothesis for Pearson Correlation (independence)
Null hypothesis for Pearson Correlation (independence)
 
Analyzing Student Debt with R
Analyzing Student Debt with R Analyzing Student Debt with R
Analyzing Student Debt with R
 

More from Uconn Stamford (20)

Chapter 11 (1)
Chapter 11 (1)Chapter 11 (1)
Chapter 11 (1)
 
Ch17 lecture0
Ch17 lecture0Ch17 lecture0
Ch17 lecture0
 
Schaefer c14 (1)
Schaefer c14 (1)Schaefer c14 (1)
Schaefer c14 (1)
 
Schaefer c11 (1)
Schaefer c11 (1)Schaefer c11 (1)
Schaefer c11 (1)
 
Schaefer c10(1)
Schaefer c10(1)Schaefer c10(1)
Schaefer c10(1)
 
Schaefer c4
Schaefer c4Schaefer c4
Schaefer c4
 
Ch04 lecture (2)
Ch04 lecture (2)Ch04 lecture (2)
Ch04 lecture (2)
 
Schaefer c9 (1)
Schaefer c9 (1)Schaefer c9 (1)
Schaefer c9 (1)
 
Schaefer c10 (1)
Schaefer c10 (1)Schaefer c10 (1)
Schaefer c10 (1)
 
Ch13 prs
Ch13 prsCh13 prs
Ch13 prs
 
Ch12 prs
Ch12 prsCh12 prs
Ch12 prs
 
Ch13 lecture0 (1)
Ch13 lecture0 (1)Ch13 lecture0 (1)
Ch13 lecture0 (1)
 
Ch12 lecture0 (8)
Ch12 lecture0 (8)Ch12 lecture0 (8)
Ch12 lecture0 (8)
 
Ma ch 11 fiscal policy (1)
Ma ch 11 fiscal policy (1)Ma ch 11 fiscal policy (1)
Ma ch 11 fiscal policy (1)
 
Ma ch 13 money and the financial system (1)
Ma ch 13 money and the financial system (1)Ma ch 13 money and the financial system (1)
Ma ch 13 money and the financial system (1)
 
Ma ch 14 banking and the money supply (1)
Ma ch 14 banking and the money supply (1)Ma ch 14 banking and the money supply (1)
Ma ch 14 banking and the money supply (1)
 
Ch07 prs0
Ch07 prs0Ch07 prs0
Ch07 prs0
 
Ch08 prs0
Ch08 prs0Ch08 prs0
Ch08 prs0
 
Foreign born population profile
Foreign born population profileForeign born population profile
Foreign born population profile
 
Ch12 lecture0 (4)
Ch12 lecture0 (4)Ch12 lecture0 (4)
Ch12 lecture0 (4)
 

Recently uploaded

Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 
Call Girls Electronic City Just Call 👗 7737669865 👗 Top Class Call Girl Servi...
Call Girls Electronic City Just Call 👗 7737669865 👗 Top Class Call Girl Servi...Call Girls Electronic City Just Call 👗 7737669865 👗 Top Class Call Girl Servi...
Call Girls Electronic City Just Call 👗 7737669865 👗 Top Class Call Girl Servi...amitlee9823
 
Organizational Transformation Lead with Culture
Organizational Transformation Lead with CultureOrganizational Transformation Lead with Culture
Organizational Transformation Lead with CultureSeta Wicaksana
 
Business Model Canvas (BMC)- A new venture concept
Business Model Canvas (BMC)-  A new venture conceptBusiness Model Canvas (BMC)-  A new venture concept
Business Model Canvas (BMC)- A new venture conceptP&CO
 
B.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptx
B.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptxB.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptx
B.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptxpriyanshujha201
 
Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...
Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...
Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...lizamodels9
 
Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...
Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...
Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...lizamodels9
 
BAGALUR CALL GIRL IN 98274*61493 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRL
BAGALUR CALL GIRL IN 98274*61493 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRLBAGALUR CALL GIRL IN 98274*61493 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRL
BAGALUR CALL GIRL IN 98274*61493 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRLkapoorjyoti4444
 
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdfRenandantas16
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756dollysharma2066
 
Call Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service NoidaCall Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service Noidadlhescort
 
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Dipal Arora
 
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876dlhescort
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableDipal Arora
 
Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1kcpayne
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMANIlamathiKannappan
 
Chandigarh Escorts Service 📞8868886958📞 Just📲 Call Nihal Chandigarh Call Girl...
Chandigarh Escorts Service 📞8868886958📞 Just📲 Call Nihal Chandigarh Call Girl...Chandigarh Escorts Service 📞8868886958📞 Just📲 Call Nihal Chandigarh Call Girl...
Chandigarh Escorts Service 📞8868886958📞 Just📲 Call Nihal Chandigarh Call Girl...Sheetaleventcompany
 
It will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 MayIt will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 MayNZSG
 
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLMONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLSeo
 

Recently uploaded (20)

Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Hebbal Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Call Girls Electronic City Just Call 👗 7737669865 👗 Top Class Call Girl Servi...
Call Girls Electronic City Just Call 👗 7737669865 👗 Top Class Call Girl Servi...Call Girls Electronic City Just Call 👗 7737669865 👗 Top Class Call Girl Servi...
Call Girls Electronic City Just Call 👗 7737669865 👗 Top Class Call Girl Servi...
 
Organizational Transformation Lead with Culture
Organizational Transformation Lead with CultureOrganizational Transformation Lead with Culture
Organizational Transformation Lead with Culture
 
Business Model Canvas (BMC)- A new venture concept
Business Model Canvas (BMC)-  A new venture conceptBusiness Model Canvas (BMC)-  A new venture concept
Business Model Canvas (BMC)- A new venture concept
 
B.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptx
B.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptxB.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptx
B.COM Unit – 4 ( CORPORATE SOCIAL RESPONSIBILITY ( CSR ).pptx
 
Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...
Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...
Call Girls From Pari Chowk Greater Noida ❤️8448577510 ⊹Best Escorts Service I...
 
Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...
Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...
Russian Call Girls In Gurgaon ❤️8448577510 ⊹Best Escorts Service In 24/7 Delh...
 
BAGALUR CALL GIRL IN 98274*61493 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRL
BAGALUR CALL GIRL IN 98274*61493 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRLBAGALUR CALL GIRL IN 98274*61493 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRL
BAGALUR CALL GIRL IN 98274*61493 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRL
 
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabiunwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
unwanted pregnancy Kit [+918133066128] Abortion Pills IN Dubai UAE Abudhabi
 
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
Call Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service NoidaCall Girls In Noida 959961⊹3876 Independent Escort Service Noida
Call Girls In Noida 959961⊹3876 Independent Escort Service Noida
 
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
 
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
Call Girls in Delhi, Escort Service Available 24x7 in Delhi 959961-/-3876
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
 
Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMAN
 
Chandigarh Escorts Service 📞8868886958📞 Just📲 Call Nihal Chandigarh Call Girl...
Chandigarh Escorts Service 📞8868886958📞 Just📲 Call Nihal Chandigarh Call Girl...Chandigarh Escorts Service 📞8868886958📞 Just📲 Call Nihal Chandigarh Call Girl...
Chandigarh Escorts Service 📞8868886958📞 Just📲 Call Nihal Chandigarh Call Girl...
 
It will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 MayIt will be International Nurses' Day on 12 May
It will be International Nurses' Day on 12 May
 
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRLMONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
MONA 98765-12871 CALL GIRLS IN LUDHIANA LUDHIANA CALL GIRL
 

Sfs4e ppt 04

  • 1. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Chapter Describing the Relation between Two Variables 4
  • 2. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Section Scatter Diagrams and Correlation 4.1
  • 3. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Objectives 1. Draw and interpret scatter diagrams 2. Describe the properties of the linear correlation coefficient 3. Compute and interpret the linear correlation coefficient 4. Determine whether a linear relation exists between two variables 5. Explain the difference between correlation and causation 4-3
  • 4. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-4 Objective 1 • Draw and interpret scatter diagrams
  • 5. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-5 The response variable is the variable whose value can be explained by the value of the explanatory or predictor variable.
  • 6. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-6 A scatter diagram is a graph that shows the relationship between two quantitative variables measured on the same individual. Each individual in the data set is represented by a point in the scatter diagram. The explanatory variable is plotted on the horizontal axis, and the response variable is plotted on the vertical axis.
  • 7. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-7 EXAMPLE Drawing and Interpreting a Scatter DiagramEXAMPLE Drawing and Interpreting a Scatter Diagram The data shown to the right are based on a study for drilling rock. The researchers wanted to determine whether the time it takes to dry drill a distance of 5 feet in rock increases with the depth at which the drilling begins. So, depth at which drilling begins is the explanatory variable, x, and time (in minutes) to drill five feet is the response variable, y. Draw a scatter diagram of the data. Source: Penner, R., and Watts, D.G. “Mining Information.” The American Statistician, Vol. 45, No. 1, Feb. 1991, p. 6.
  • 8. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-8
  • 9. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-9 Various Types of Relations in a Scatter Diagram
  • 10. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-10 Two variables that are linearly related are positively associated when above-average values of one variable are associated with above-average values of the other variable and below-average values of one variable are associated with below- average values of the other variable. That is, two variables are positively associated if, whenever the value of one variable increases, the value of the other variable also increases.
  • 11. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-11 Two variables that are linearly related are negatively associated when above-average values of one variable are associated with below- average values of the other variable. That is, two variables are negatively associated if, whenever the value of one variable increases, the value of the other variable decreases.
  • 12. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-12 Objective 2 • Describe the Properties of the Linear Correlation Coefficient
  • 13. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-13 The linear correlation coefficient or Pearson product moment correlation coefficient is a measure of the strength and direction of the linear relation between two quantitative variables. The Greek letter ρ (rho) represents the population correlation coefficient, and r represents the sample correlation coefficient. We present only the formula for the sample correlation coefficient.
  • 14. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-14 Sample Linear Correlation Coefficient where is the sample mean of the explanatory variable sx is the sample standard deviation of the explanatory variable is the sample mean of the response variable sy is the sample standard deviation of the response variable n is the number of individuals in the sample x y r = xi − x sx     yi − y sy      ∑ n −1
  • 15. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-15 Properties of the Linear Correlation Coefficient 1.The linear correlation coefficient is always between –1 and 1, inclusive. That is, –1 ≤ r ≤ 1. 2.If r = + 1, then a perfect positive linear relation exists between the two variables. 3.If r = –1, then a perfect negative linear relation exists between the two variables. 4.The closer r is to +1, the stronger is the evidence of positive association between the two variables. 5.The closer r is to –1, the stronger is the evidence of negative association between the two variables.
  • 16. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-16 6. If r is close to 0, then little or no evidence exists of a linear relation between the two variables. So r close to 0 does not imply no relation, just no linear relation. 7. The linear correlation coefficient is a unitless measure of association. So the unit of measure for x and y plays no role in the interpretation of r. 8. The correlation coefficient is not resistant. Therefore, an observation that does not follow the overall pattern of the data could affect the value of the linear correlation coefficient.
  • 17. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-17
  • 18. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-18
  • 19. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-19 Objective 3 • Compute and Interpret the Linear Correlation Coefficient
  • 20. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-20 EXAMPLE Determining the Linear Correlation CoefficientEXAMPLE Determining the Linear Correlation Coefficient Determine the linear correlation coefficient of the drilling data.
  • 21. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-21 xi − x sx yi − y sy xi − x sx     yi − y sy       xi − x sx     yi − y sy      ∑ =x = y =
  • 22. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-22 r = xi − x sx    ∑ yi − y sy       n −1 = 8.501037 12 −1 = 0.773
  • 23. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-23
  • 24. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-24 Objective 4 • Determine Whether a Linear Relation Exists between Two Variables
  • 25. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-25 Testing for a Linear Relation Step 1 Determine the absolute value of the correlation coefficient. Step 2 Find the critical value in Table II from Appendix A for the given sample size. Step 3 If the absolute value of the correlation coefficient is greater than the critical value, we say a linear relation exists between the two variables. Otherwise, no linear relation exists.
  • 26. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-26 EXAMPLE Does a Linear Relation Exist?EXAMPLE Does a Linear Relation Exist? Determine whether a linear relation exists between time to drill five feet and depth at which drilling begins. Comment on the type of relation that appears to exist between time to drill five feet and depth at which drilling begins. The correlation between drilling depth and time to drill is 0.773. The critical value for n = 12 observations is 0.576. Since 0.773 > 0.576, there is a positive linear relation between time to drill five feet and depth at which drilling begins.
  • 27. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-27 Objective 5 • Explain the Difference between Correlation and Causation
  • 28. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-28 According to data obtained from the Statistical Abstract of the United States, the correlation between the percentage of the female population with a bachelor’s degree and the percentage of births to unmarried mothers since 1990 is 0.940. Does this mean that a higher percentage of females with bachelor’s degrees causes a higher percentage of births to unmarried mothers?
  • 29. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-29 Certainly not! The correlation exists only because both percentages have been increasing since 1990. It is this relation that causes the high correlation. In general, time series data (data collected over time) may have high correlations because each variable is moving in a specific direction over time (both going up or down over time; one increasing, while the other is decreasing over time). When data are observational, we cannot claim a causal relation exists between two variables. We can only claim causality when the data are collected through a designed experiment.
  • 30. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-30 Another way that two variables can be related even though there is not a causal relation is through a lurking variable. A lurking variable is related to both the explanatory and response variable. For example, ice cream sales and crime rates have a very high correlation. Does this mean that local governments should shut down all ice cream shops? No! The lurking variable is temperature. As air temperatures rise, both ice cream sales and crime rates rise.
  • 31. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-31 EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study Because colas tend to replace healthier beverages and colas contain caffeine and phosphoric acid, researchers Katherine L. Tucker and associates wanted to know whether cola consumption is associated with lower bone mineral density in women. The table lists the typical number of cans of cola consumed in a week and the femoral neck bone mineral density for a sample of 15 women. The data were collected through a prospective cohort study.
  • 32. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-32 EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study The figure on the next slide shows the scatter diagram of the data. The correlation between number of colas per week and bone mineral density is –0.806.The critical value for correlation with n = 15 from Table II in Appendix A is 0.514. Because |–0.806| > 0.514, we conclude a negative linear relation exists between number of colas consumed and bone mineral density. Can the authors conclude that an increase in the number of colas consumed causes a decrease in bone mineral density? Identify some lurking variables in the study.
  • 33. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-33 EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study
  • 34. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-34 EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study In prospective cohort studies, data are collected on a group of subjects through questionnaires and surveys over time. Therefore, the data are observational. So the researchers cannot claim that increased cola consumption causes a decrease in bone mineral density. Some lurking variables in the study that could confound the results are: • body mass index • height • smoking • alcohol consumption • calcium intake • physical activity
  • 35. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-35 EXAMPLE Lurking Variables in a Bone Mineral Density StudyEXAMPLE Lurking Variables in a Bone Mineral Density Study The authors were careful to say that increased cola consumption is associated with lower bone mineral density because of potential lurking variables. They never stated that increased cola consumption causes lower bone mineral density.
  • 36. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Section Least-squares Regression 4.2
  • 37. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-37
  • 38. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Objectives 1. Find the least-squares regression line and use the line to make predictions 2. Interpret the slope and the y-intercept of the least-squares regression line 3. Compute the sum of squared residuals 4-38
  • 39. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-39 Using the following sample data: Using (2, 5.7) and (6, 1.9): m = 5.7 −1.9 2 − 6 = −0.95 y − y1 = m x − x1( ) y − 5.7 = −0.95 x − 2( ) y − 5.7 = −0.95x +1.9 y = −0.95x + 7.6 EXAMPLE Finding an Equation that Describes Linearly Relate Data EXAMPLE Finding an Equation that Describes Linearly Relate Data (a) Find a linear equation that relates x (the explanatory variable) and y (the response variable) by selecting two points and finding the equation of the line containing the points.
  • 40. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-40 (b) Graph the equation on the scatter diagram. (c) Use the equation to predict y if x = 3. y = −0.95x + 7.6 = −0.95(3) + 7.6 = 4.75
  • 41. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-41 Objective 1 • Find the least-squares regression line and use the line to make predictions
  • 42. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-42 } (3, 5.2) residual = observed y – predicted y = 5.2 – 4.75 = 0.45 The difference between the observed value of y and the predicted value of y is the error, or residual. Using the line from the last example, and the predicted value at x = 3: residual = observed y – predicted y = 5.2 – 4.75 = 0.45
  • 43. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-43 Least-Squares Regression Criterion The least-squares regression line is the line that minimizes the sum of the squared errors (or residuals). This line minimizes the sum of the squared vertical distance between the observed values of y and those predicted by the line , (“y-hat”). We represent this as “ minimize Σ residuals2 ”. ˆy
  • 44. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-44 The Least-Squares Regression Line The equation of the least-squares regression line is given by ˆy = b1x + b0 where where is the slope of the least-squares regression line b1 = r ⋅ sy sx b0 = y − b1x is the y-intercept of the least- squares regression line
  • 45. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-45 The Least-Squares Regression Line Note: is the sample mean and sx is the sample standard deviation of the explanatory variable x ; is the sample mean and sy is the sample standard deviation of the response variable y. y x
  • 46. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-46 EXAMPLE Finding the Least-squares Regression LineEXAMPLE Finding the Least-squares Regression Line Using the drilling data (a)Find the least-squares regression line. (b)Predict the drilling time if drilling starts at 130 feet. (c)Is the observed drilling time at 130 feet above, or below, average. (d)Draw the least-squares regression line on the scatter diagram of the data.
  • 47. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-47 (a) We agree to round the estimates of the slope and intercept to four decimal places. (b) (c) The observed drilling time is 6.93 seconds. The predicted drilling time is 7.035 seconds. The drilling time of 6.93 seconds is below average. ˆy = 0.0116x + 5.5273 ˆy = 0.0116x + 5.5273 = 0.0116(130) + 5.5273 = 7.035
  • 48. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-48 (d)
  • 49. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-49 Objective 2 • Interpret the Slope and the y-intercept of the Least-Squares Regression Line
  • 50. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-50 Interpretation of Slope: The slope of the regression line is 0.0116. For each additional foot of depth we start drilling, the time to drill five feet increases by 0.0116 minutes, on average.
  • 51. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-51 Interpretation of the y-Intercept: The y-intercept of the regression line is 5.5273. To interpret the y- intercept, we must first ask two questions: 1. Is 0 a reasonable value for the explanatory variable? 2. Do any observations near x = 0 exist in the data set? A value of 0 is reasonable for the drilling data (this indicates that drilling begins at the surface of Earth. The smallest observation in the data set is x = 35 feet, which is reasonably close to 0. So, interpretation of the y-intercept is reasonable. The time to drill five feet when we begin drilling at the surface of Earth is 5.5273 minutes.
  • 52. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-52 If the least-squares regression line is used to make predictions based on values of the explanatory variable that are much larger or much smaller than the observed values, we say the researcher is working outside the scope of the model. Never use a least-squares regression line to make predictions outside the scope of the model because we can’t be sure the linear relation continues to exist.
  • 53. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-53 Objective 3 • Compute the Sum of Squared Residuals
  • 54. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-54 To illustrate the fact that the sum of squared residuals for a least-squares regression line is less than the sum of squared residuals for any other line, use the “regression by eye” applet.
  • 55. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Section The Coefficient of Determination 4.3
  • 56. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-56 Objective 1 • Compute and Interpret the Coefficient of Determination
  • 57. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-57 The coefficient of determination, R2 , measures the proportion of total variation in the response variable that is explained by the least-squares regression line. The coefficient of determination is a number between 0 and 1, inclusive. That is, 0 < R2 < 1. If R2 = 0 the line has no explanatory value If R2 = 1 means the line explains 100% of the variation in the response variable.
  • 58. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-58 The data to the right are based on a study for drilling rock. The researchers wanted to determine whether the time it takes to dry drill a distance of 5 feet in rock increases with the depth at which the drilling begins. So, depth at which drilling begins is the predictor variable, x, and time (in minutes) to drill five feet is the response variable, y. Source: Penner, R., and Watts, D.G. “Mining Information.” The American Statistician, Vol. 45, No. 1, Feb. 1991, p. 6.
  • 59. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-59
  • 60. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-60 Regression Analysis The regression equation is Time = 5.53 + 0.0116 Depth Sample Statistics Mean Standard Deviation Depth 126.2 52.2 Time 6.99 0.781 Correlation Between Depth and Time: 0.773
  • 61. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-61 Suppose we were asked to predict the time to drill an additional 5 feet, but we did not know the current depth of the drill. What would be our best “guess”?
  • 62. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-62 Suppose we were asked to predict the time to drill an additional 5 feet, but we did not know the current depth of the drill. What would be our best “guess”? ANSWER: The mean time to drill an additional 5 feet: 6.99 minutes
  • 63. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-63 Now suppose that we are asked to predict the time to drill an additional 5 feet if the current depth of the drill is 160 feet? ANSWER: Our “guess” increased from 6.99 minutes to 7.39 minutes based on the knowledge that drill depth is positively associated with drill time. ˆy = 5.53+ 0.0116(160) = 7.39
  • 64. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-64
  • 65. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-65 The difference between the observed value of the response variable and the mean value of the response variable is called the total deviation and is equal to y − y
  • 66. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-66 The difference between the predicted value of the response variable and the mean value of the response variable is called the explained deviation and is equal to ˆy − y
  • 67. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-67 The difference between the observed value of the response variable and the predicted value of the response variable is called the unexplained deviation and is equal to y − ˆy
  • 68. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-68 Total Deviation y − y = y − ˆy( ) + ˆy − y( ) Unexplained Deviation Explained Deviation +=
  • 69. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-69 Total Deviation y − y( )2 ∑ = y − ˆy( )2 ∑ + ˆy − y( )2 ∑ Unexplained Deviation Explained Deviation +=
  • 70. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-70 Total Variation = Unexplained Variation + Explained Variation 1 = Unexplained Variation Explained Variation Unexplained VariationExplained Variation Total Variation Total Variation Total VariationTotal Variation + = 1 –R2 =
  • 71. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-71 To determine R2 for the linear regression model simply square the value of the linear correlation coefficient. To determine R2 for the linear regression model simply square the value of the linear correlation coefficient. Squaring the linear correlation coefficient to obtain the coefficient of determination works only for the least-squares linear regression model The method does not work in general. ˆy = b1x + b0
  • 72. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-72 EXAMPLE Determining the Coefficient of DeterminationEXAMPLE Determining the Coefficient of Determination Find and interpret the coefficient of determination for the drilling data. Because the linear correlation coefficient, r, is 0.773, we have that R2 = 0.7732 = 0.5975 = 59.75%. So, 59.75% of the variability in drilling time is explained by the least-squares regression line.
  • 73. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-73 Draw a scatter diagram for each of these data sets. For each data set, the variance of y is 17.49.
  • 74. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-74 Data Set A Data Set B Data Set C Data Set A: 99.99% of the variability in y is explained by the least-squares regression line Data Set B: 94.7% of the variability in y is explained by the least-squares regression line Data Set C: 9.4% of the variability in y is explained by the least-squares regression line
  • 75. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Section Contingency Tables and Association 4.4
  • 76. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Objectives 1. Compute the marginal distribution of a variable 2. Use the conditional distribution to identify association among categorical data 3. Explain Simpson’s Paradox 4-76
  • 77. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-77 A professor at a community college in New Mexico conducted a study to assess the effectiveness of delivering an introductory statistics course via traditional lecture-based method, online delivery (no classroom instruction), and hybrid instruction (online course with weekly meetings) methods, the grades students received in each of the courses were tallied. The table is referred to as a contingency table, or two-way table, because it relates two categories of data. The row variable is grade, because each row in the table describes the grade received for each group. The column variable is delivery method. Each box inside the table is referred to as a cell.
  • 78. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-78 Objective 1 • Compute the Marginal Distribution of a Variable
  • 79. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-79 A marginal distribution of a variable is a frequency or relative frequency distribution of either the row or column variable in the contingency table.
  • 80. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-80 EXAMPLE Determining Frequency Marginal DistributionsEXAMPLE Determining Frequency Marginal Distributions A professor at a community college in New Mexico conducted a study to assess the effectiveness of delivering an introductory statistics course via traditional lecture-based method, online delivery (no classroom instruction), and hybrid instruction (online course with weekly meetings) methods, the grades students received in each of the courses were tallied. Find the frequency marginal distributions for course grade and delivery method.
  • 81. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-81 EXAMPLE Determining Relative Frequency Marginal DistributionsEXAMPLE Determining Relative Frequency Marginal Distributions Determine the relative frequency marginal distribution for course grade and delivery method.
  • 82. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-82 Objective 2 • Use the Conditional Distribution to Identify Association among Categorical Data
  • 83. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-83 A conditional distribution lists the relative frequency of each category of the response variable, given a specific value of the explanatory variable in the contingency table.
  • 84. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-84 EXAMPLE Determining a Conditional DistributionEXAMPLE Determining a Conditional Distribution Construct a conditional distribution of course grade by method of delivery. Comment on any type of association that may exist between course grade and delivery method. It appears that students in the hybrid course are more likely to pass (A, B, or C) than the other two methods.
  • 85. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-85 EXAMPLE Drawing a Bar Graph of a Conditional DistributionEXAMPLE Drawing a Bar Graph of a Conditional Distribution Using the results of the previous example, draw a bar graph that represents the conditional distribution of method of delivery by grade earned.
  • 86. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-86 The following contingency table shows the survival status and demographics of passengers on the ill-fated Titanic. Draw a conditional bar graph of survival status by demographic characteristic.
  • 87. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-87 Objective 1 • Explain Simpson’s Paradox
  • 88. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-88 Insulin dependent (or Type 1) diabetes is a disease that results in the permanent destruction of insulin- producing beta cells of the pancreas. Type 1 diabetes is lethal unless treatment with insulin injections replaces the missing hormone. Individuals with insulin independent (or Type 2) diabetes can produce insulin internally. The data shown in the table below represent the survival status of 902 patients with diabetes by type over a 5-year period. Type 1 Type 2 Total Survived 253 326 579 Died 105 218 323 358 544 902 EXAMPLE Illustrating Simpson’s ParadoxEXAMPLE Illustrating Simpson’s Paradox
  • 89. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-89 EXAMPLE Illustrating Simpson’s ParadoxEXAMPLE Illustrating Simpson’s Paradox From the table, the proportion of patients with Type 1 diabetes who died was 105/358 = 0.29; the proportion of patients with Type 2 diabetes who died was 218/544 = 0.40. Based on this, we might conclude that Type 2 diabetes is more lethal than Type 1 diabetes. Type 1 Type 2 Total Survived 253 326 579 Died 105 218 323 358 544 902
  • 90. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-90 However, Type 2 diabetes is usually contracted after the age of 40. If we account for the variable age and divide our patients into two groups (those 40 or younger and those over 40), we obtain the data in the table below. Type 1 Type 2 Total < 40 > 40 < 40 > 40 Survived 129 124 15 311 579 Died 1 104 0 218 323 130 228 15 529 902
  • 91. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-91 Of the diabetics 40 years of age or younger, the proportion of those with Type 1 diabetes who died is 1/130 = 0.008; the proportion of those with Type 2 diabetes who died is 0/15 = 0. Type 1 Type 2 Total < 40 > 40 < 40 > 40 Survived 129 124 15 311 579 Died 1 104 0 218 323 130 228 15 529 902
  • 92. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-92 Of the diabetics over 40 years of age, the proportion of those with Type 1 diabetes who died is 104/228 = 0.456; the proportion of those with Type 2 diabetes who died is 218/529 = 0.412. The lurking variable age led us to believe that Type 2 diabetes is the more dangerous type of diabetes. Type 1 Type 2 Total < 40 > 40 < 40 > 40 Survived 129 124 15 311 579 Died 1 104 0 218 323 130 228 15 529 902
  • 93. Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc.4-93 Simpson’s Paradox represents a situation in which an association between two variables inverts or goes away when a third variable is introduced to the analysis.