Chapter 13 Simple Linear Regression & Correlation Inferential Methods
Deterministic Models
Probabilistic Models Y = deterministic function of x + random deviation = f(x) + e
Probabilistic Models [Figure: deviations from the deterministic part of a probabilistic model; one labeled deviation is e = -1.5]
Simple Linear Regression Model The simple linear regression model assumes that there is a line with vertical or y intercept $\alpha$ and slope $\beta$, called the true or population regression line. When a value of the independent variable x is fixed and an observation on the dependent variable y is made, $y = \alpha + \beta x + e$. Without the random deviation e, all observed (x, y) points would fall exactly on the population regression line. The inclusion of e in the model equation allows points to deviate from the line by random amounts.
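The model equation can be illustrated by simulation. The following is a minimal Python sketch, assuming normally distributed random deviations; the intercept (alpha), slope (beta), and standard deviation (sigma) values are hypothetical and chosen only for illustration.

```python
import random

def simulate_y(x, alpha, beta, sigma, rng):
    """One observation y = alpha + beta*x + e, with e drawn from Normal(0, sigma)."""
    e = rng.gauss(0.0, sigma)  # random deviation from the population line
    return alpha + beta * x + e

rng = random.Random(0)                 # fixed seed so the run is repeatable
alpha, beta, sigma = 3.0, 0.5, 2.0     # hypothetical population values
xs = [20, 30, 40, 50, 60]

# With sigma > 0 the points scatter about the population regression line.
ys = [simulate_y(x, alpha, beta, sigma, rng) for x in xs]

# With sigma = 0 there is no random deviation, so every point is on the line.
on_line = [simulate_y(x, alpha, beta, 0.0, rng) for x in xs]

for x, y, y0 in zip(xs, ys, on_line):
    print(f"x = {x}: observed y = {y:.2f}, value on the line = {y0:.2f}")
```

Setting sigma to zero reproduces the statement that, without the random deviation, all points fall exactly on the population regression line.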
Simple Linear Regression Model [Figure: population regression line with slope $\beta$ and vertical intercept $\alpha$, showing one observation at x = x_1 and one at x = x_2 together with their deviations from the line]
Basic Assumptions of the Simple Linear Regression Model
More About the Simple Linear Regression Model For any fixed x value, (mean y value for fixed x) = $\alpha + \beta x$ and (standard deviation of y for fixed x) = $\sigma$. For any fixed x value, y itself has a normal distribution.
Interpretation of Terms [Figure panels: small $\sigma$; large $\sigma$]
Illustration of Assumptions
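Estimates for the Regression Line The point estimates of $\beta$, the slope, and $\alpha$, the y intercept of the population regression line, are the slope and y intercept, respectively, of the least squares line. That is, $b = \dfrac{S_{xy}}{S_{xx}}$ and $a = \bar{y} - b\bar{x}$, where $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$ and $S_{xx} = \sum (x - \bar{x})^2$.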
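The least squares slope and intercept can be computed directly from the data. The following is a minimal Python sketch, assuming the usual least squares formulas (slope = S_xy / S_xx, intercept = y-bar minus slope times x-bar); the function name and the small data set are made up for illustration and are not the data from the slides.

```python
def least_squares(xs, ys):
    """Return (a, b): the y intercept and slope of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # point estimate of the population slope
    a = y_bar - b * x_bar    # point estimate of the population y intercept
    return a, b

# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
a, b = least_squares(xs, ys)
print(f"least squares line: y = {a:.2f} + {b:.2f} x")
```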
Interpretation of y = a + bx
Example The following data were collected in a study of age and fatness in humans.* One of the questions was, “What is the relationship between age and fatness?” * Mazess, R.B., Peppler, W.W., and Gibbons, M. (1984). Total body composition by dual-photon (153Gd) absorptiometry. American Journal of Clinical Nutrition, 40, 834-839.
Example
Example
Example
Example A point estimate for the %Fat of a human who is 45 years old is $\hat{y} = 3.22 + 0.548(45) = 27.88$. If 45 is put into the equation for x, we have both an estimated %Fat for a 45-year-old human or an estimated average %Fat for 45-year-old humans. The two interpretations are quite different.
Example A plot of the data points along with the least squares regression line created with Minitab is given to the right.
Terminology
Definition formulae The total sum of squares, denoted by SSTo, is defined as $\text{SSTo} = \sum (y - \bar{y})^2$. The residual sum of squares, denoted by SSResid, is defined as $\text{SSResid} = \sum (y - \hat{y})^2$.
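Calculation Formulae Recalled SSTo and SSResid are generally found as part of the standard output from most statistical packages or can be obtained using the following computational formulas: $\text{SSTo} = \sum y^2 - \dfrac{(\sum y)^2}{n}$ and $\text{SSResid} = \sum y^2 - a\sum y - b\sum xy$.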
Coefficient of Determination
Estimated Standard Deviation, $s_e$ The statistic for estimating the variance $\sigma^2$ is $s_e^2 = \dfrac{\text{SSResid}}{n - 2}$, where $\text{SSResid} = \sum (y - \hat{y})^2$.
Estimated Standard Deviation, $s_e$ The estimate of $\sigma$ is the estimated standard deviation $s_e = \sqrt{s_e^2}$. The number of degrees of freedom associated with estimating $\sigma^2$ or $\sigma$ in simple linear regression is n - 2.
Example continued
Example continued
Example continued With $r^2$ = 0.627 or 62.7%, we can say that 62.7% of the observed variation in %Fat can be attributed to the probabilistic linear relationship with human age. The magnitude of a typical sample deviation from the least squares line is about 5.75 percentage points, which is reasonably large compared to the y values themselves. This would suggest that the model is only useful in the sense of providing rough “ballpark” estimates of %Fat for humans based on age.
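These two summary values can be recovered from the sums of squares. The following is a minimal Python sketch, assuming the usual relations (coefficient of determination = 1 - SSResid/SSTo, estimated standard deviation = square root of SSResid/(n - 2)); the numbers are the SSResid, SSTo, and sample size reported for the %Fat vs. Age example in this deck.

```python
import math

# Values reported for the %Fat vs. Age example.
ss_resid = 529.66   # residual sum of squares, SSResid
ss_to = 1421.54     # total sum of squares, SSTo
n = 18              # number of data points

# Proportion of observed y variation attributed to the linear relationship.
r_squared = 1 - ss_resid / ss_to

# Estimated standard deviation about the line, based on n - 2 degrees of freedom.
s_e = math.sqrt(ss_resid / (n - 2))

print(f"r^2 = {r_squared:.3f}, s_e = {s_e:.3f}")
```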
Properties of the Sampling Distribution of b When the four basic assumptions of the simple linear regression model are satisfied, the following conditions are met:
Estimated Standard Deviation of b The estimated standard deviation of the statistic b is $s_b = \dfrac{s_e}{\sqrt{S_{xx}}}$, where $S_{xx} = \sum (x - \bar{x})^2$. When the four basic assumptions of the simple linear regression model are satisfied, the probability distribution of the standardized variable $t = \dfrac{b - \beta}{s_b}$ is the t distribution with df = n - 2.
Confidence interval for $\beta$ When the four basic assumptions of the simple linear regression model are satisfied, a confidence interval for $\beta$, the slope of the population regression line, has the form $b \pm (\text{t critical value}) \cdot s_b$, where the t critical value is based on df = n - 2.
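Example continued Recall that b = 0.548 and $s_b$ = 0.1056 with n = 18, so df = 16 and the t critical value is 2.12. A 95% confidence interval estimate for $\beta$ is $0.548 \pm (2.12)(0.1056) = 0.548 \pm 0.224 = (0.324, 0.772)$.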
Example continued A 95% confidence interval estimate for $\beta$ is (0.324, 0.772). Based on the sample data, we are 95% confident that the true mean increase in %Fat associated with a year of age is between 0.324 and 0.772 percentage points.
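The interval endpoints can be checked with a short calculation. The following is a minimal Python sketch using the slope estimate and its standard error as reported for the %Fat vs. Age example; the t critical value 2.120 (95% confidence, df = 16) is taken from a standard t table and is an assumption of this sketch rather than something computed here.

```python
# Values reported for the %Fat vs. Age example.
b = 0.5480      # estimated slope
s_b = 0.1056    # estimated standard deviation of b
n = 18          # number of data points, so df = n - 2 = 16

# Assumption: two-sided 95% t critical value for df = 16, read from a t table.
t_crit = 2.120

margin = t_crit * s_b
ci = (b - margin, b + margin)   # b +/- (t critical value) * s_b

print(f"95% CI for the slope: ({ci[0]:.3f}, {ci[1]:.3f})")
```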
Example continued Minitab output looks like

Regression Analysis: % Fat y versus Age (x)

The regression equation is
% Fat y = 3.22 + 0.548 Age (x)

Predictor    Coef      SE Coef    T       P
Constant     3.221     5.076      0.63    0.535
Age (x)      0.5480    0.1056     5.19    0.000

S = 5.754    R-Sq = 62.7%    R-Sq(adj) = 60.4%

Analysis of Variance
Source            DF    SS         MS        F        P
Regression         1     891.87    891.87    26.94    0.000
Residual Error    16     529.66     33.10
Total             17    1421.54

In this output, the regression equation gives the regression line; the Constant coefficient (3.221) is the estimated y intercept a; the Age (x) coefficient (0.5480) is the estimated slope b; the Residual Error DF (16) is the residual df = n - 2; the Residual Error SS (529.66) is SSResid; and the Total SS (1421.54) is SSTo.
Hypothesis Tests Concerning $\beta$ Null hypothesis: $H_0$: $\beta$ = hypothesized value
Hypothesis Tests Concerning $\beta$
Hypothesis Tests Concerning $\beta$
Hypothesis Tests Concerning $\beta$
Hypothesis Tests Concerning $\beta$ Quite often the test is performed with the hypotheses $H_0$: $\beta = 0$ vs. $H_a$: $\beta \neq 0$. This particular form of the test is called the model utility test for simple linear regression. The null hypothesis specifies that there is no useful linear relationship between x and y, whereas the alternative hypothesis specifies that there is a useful linear relationship between x and y. The test statistic simplifies to $t = \dfrac{b}{s_b}$ and is called the t ratio.
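The t ratio and its two-sided P-value can be sketched in a few lines. The following minimal Python example uses the slope estimate and standard error from the earlier %Fat vs. Age Minitab output. The helper functions `t_pdf` and `t_upper_tail` are hypothetical names introduced here; they approximate the tail area of the t distribution by numerical integration (Simpson's rule), which is only one of several ways to obtain a P-value and is not necessarily how a statistical package does it.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Area under the t curve to the right of t >= 0, via Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    # Half the total area lies to the right of 0; subtract the part between 0 and t.
    return 0.5 - total * h / 3

# Values from the %Fat vs. Age Minitab output.
b, s_b, n = 0.5480, 0.1056, 18

t_ratio = b / s_b                                  # model utility test statistic
p_value = 2 * t_upper_tail(abs(t_ratio), n - 2)    # two-sided P-value, df = n - 2

print(f"t ratio = {t_ratio:.2f}, P-value = {p_value:.5f}")
```

The computed t ratio agrees with the T column of that output (5.19), and the P-value is small enough to be displayed as 0.000 at three decimal places.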
Example Consider the following data on percentage unemployment and suicide rates. * Smith, D. (1977)  Patterns in Human Geography , Canada: Douglas David and Charles Ltd., 158.
Example The plot of the data points produced by Minitab follows
Example
Example Some basic summary statistics
Example Continuing with the calculations
Example Continuing with the calculations
Example
Example - Model Utility Test
Example - Model Utility Test
Example - Model Utility Test
Example - Model Utility Test
Example - Minitab Output

Regression Analysis: Suicide Rate (y) versus Percentage Unemployed (x)

The regression equation is
Suicide Rate (y) = - 93.9 + 59.1 Percentage Unemployed (x)

Predictor    Coef      SE Coef    T        P
Constant     -93.86    51.25      -1.83    0.100
Percenta     59.05     14.24      4.15     0.002

S = 36.06    R-Sq = 65.7%    R-Sq(adj) = 61.8%

The T value in the Percenta row (4.15) is the T value for the model utility test of $H_0$: $\beta = 0$ vs. $H_a$: $\beta \neq 0$, and the P value in that row (0.002) is its P-value.
Residual Analysis
Residual Analysis To check on these assumptions, one would examine the deviations $e_1, e_2, \ldots, e_n$. Generally, the deviations are not known, so we check on the assumptions by looking at the residuals, which are the deviations from the estimated line, a + bx. The residuals are given by $y_1 - \hat{y}_1 = y_1 - (a + bx_1), \ldots, y_n - \hat{y}_n = y_n - (a + bx_n)$.
Standardized Residuals Recall: A quantity is standardized by subtracting its mean value and then dividing by its true (or estimated) standard deviation. For the residuals, the true mean is zero (0) if the assumptions are true. The estimated standard deviation of a residual depends on the x value. The estimated standard deviation of the i-th residual, $y_i - \hat{y}_i$, is given by $s_e \sqrt{1 - \dfrac{1}{n} - \dfrac{(x_i - \bar{x})^2}{S_{xx}}}$, where $S_{xx} = \sum (x - \bar{x})^2$.
Standardized Residuals As you can see from the formula for the estimated standard deviation the calculation of the standardized residuals is a bit of a calculational nightmare.  Fortunately, most statistical software packages are set up to perform these calculations and do so quite proficiently.
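To show what such software does, here is a minimal Python sketch. It assumes the standard formula for the estimated standard deviation of the i-th residual, s_e * sqrt(1 - 1/n - (x_i - x_bar)^2 / S_xx); the function name and the small data set are hypothetical and are not the unemployment and suicide data from the slides.

```python
import math

def standardized_residuals(xs, ys):
    """Standardized residuals from the least squares line fitted to (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    s_e = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    result = []
    for x, r in zip(xs, residuals):
        # The estimated standard deviation of a residual depends on its x value.
        sd = s_e * math.sqrt(1 - 1 / n - (x - x_bar) ** 2 / s_xx)
        result.append(r / sd)
    return result

# Hypothetical data; the fifth point is deliberately placed well above the trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.2, 14.0, 12.1]
z = standardized_residuals(xs, ys)
for x, value in zip(xs, z):
    print(f"x = {x}: standardized residual = {value:.2f}")
```

In this made-up data set the fifth point stands out with the largest standardized residual in absolute value, which is the kind of point the slides single out for a closer look.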
Standardized Residuals - Example Consider the data on percentage unemployment and suicide rates. Notice that the standardized residual for Pittsburgh is -2.50, somewhat large for a data set of this size.
Example [Figure: plot with the Pittsburgh point labeled; this point has an unusually large residual]
Normal Plots Notice that both of the normal plots look similar. If a software package is available to do the calculation and plots, it is preferable to look at the normal plot of the standardized residuals. In both cases, the points look reasonably linear with the possible exception of Pittsburgh, so the assumption that the errors are normally distributed seems to be supported by the sample data.
More Comments The fact that Pittsburgh has a large standardized residual makes it worthwhile to look at that city carefully to make sure the figures were reported correctly. One might also look to see if there are some reasons that Pittsburgh should be looked at separately because some other characteristic distinguishes it from all of the other cities. Pittsburgh does have a large effect on the model.
Visual Interpretation of Standardized Residuals This plot is an example of a satisfactory plot that indicates that the model assumptions are reasonable.
Visual Interpretation of Standardized Residuals This plot suggests that a curvilinear regression model is needed.
Visual Interpretation of Standardized Residuals This plot suggests a non-constant variance, so the assumption that the standard deviation of e is the same for every x value does not appear to hold.
Visual Interpretation of Standardized Residuals This plot shows a data point with a large standardized residual.
Visual Interpretation of Standardized Residuals This plot shows a potentially influential observation.
Example - % Unemployment vs. Suicide Rate This plot of the residuals (errors) indicates some possible problems with this linear model. You can see a pattern to the points: most of them follow a generally decreasing pattern, one point has an unusually large residual, and two points are quite influential since they are far away from the others in terms of the % unemployed.
Properties of the Sampling Distribution of a + bx for a Fixed x Value
Properties of the Sampling Distribution of a + bx for a Fixed x Value
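Additional Information about the Sampling Distribution of a + bx for a Fixed x Value The estimated standard deviation of the statistic a + bx*, denoted by $s_{a+bx^*}$, is given by $s_{a+bx^*} = s_e \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}$, where $S_{xx} = \sum (x - \bar{x})^2$. When the four basic assumptions of the simple linear regression model are satisfied, the probability distribution of the standardized variable $t = \dfrac{a + bx^* - (\alpha + \beta x^*)}{s_{a+bx^*}}$ is the t distribution with df = n - 2.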
Confidence Interval for a Mean y Value When the four basic assumptions of the simple linear regression model are met, a confidence interval for $\alpha + \beta x^*$, the average y value when x has the value x*, is $a + bx^* \pm (\text{t critical value}) \cdot s_{a+bx^*}$, where the t critical value is based on df = n - 2. Many authors give the following equivalent form for the confidence interval: $a + bx^* \pm (\text{t critical value}) \cdot s_e \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}$.
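Confidence Interval for a Single y Value When the four basic assumptions of the simple linear regression model are met, a prediction interval for y*, a single y observation made when x has the value x*, has the form $a + bx^* \pm (\text{t critical value}) \cdot \sqrt{s_e^2 + s_{a+bx^*}^2}$, where the t critical value is based on df = n - 2. Many authors give the following equivalent form for the prediction interval: $a + bx^* \pm (\text{t critical value}) \cdot s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}$.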
Example  - Mean Annual Temperature vs. Mortality Data was collected in certain regions of Great Britain, Norway and Sweden to study the relationship between the mean annual temperature and the mortality rate for a specific type of breast cancer in women. * Lea, A.J. (1965) New Observations on distribution of neoplasms of female breast in certain European countries.  British Medical Journal ,  1 , 488-490
Example - Mean Annual Temperature vs. Mortality

Regression Analysis: Mortality index versus Mean annual temperature

The regression equation is
Mortality index = - 21.8 + 2.36 Mean annual temperature

Predictor    Coef      SE Coef    T        P
Constant     -21.79    15.67      -1.39    0.186
Mean ann     2.3577    0.3489     6.76     0.000

S = 7.545    R-Sq = 76.5%    R-Sq(adj) = 74.9%

Analysis of Variance
Source            DF    SS        MS        F        P
Regression         1    2599.5    2599.5    45.67    0.000
Residual Error    14     796.9      56.9
Total             15    3396.4

Unusual Observations
Obs    Mean ann    Mortalit    Fit      SE Fit    Residual    St Resid
15     31.8        67.30       53.18    4.85      14.12       2.44RX

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.
Example  - Mean Annual Temperature vs. Mortality The point has a large standardized residual and is influential because of the low Mean Annual Temperature.
Example - Mean Annual Temperature vs. Mortality

Predicted Values for New Observations

New Obs    Fit      SE Fit    95.0% CI            95.0% PI
1          53.18    4.85      (42.79, 63.57)      (33.95, 72.41) X
2          60.72    3.84      (52.48, 68.96)      (42.57, 78.88)
3          72.51    2.48      (67.20, 77.82)      (55.48, 89.54)
4          83.34    1.89      (79.30, 87.39)      (66.66, 100.02)
5          96.09    2.67      (90.37, 101.81)     (78.93, 113.25)
6          99.16    3.01      (92.71, 105.60)     (81.74, 116.57)

X denotes a row with X values away from the center

Values of Predictors for New Observations

New Obs    Mean ann
1          31.8
2          35.0
3          40.0
4          44.6
5          50.0
6          51.3

These are the x* values for which the fits, standard errors of the fits, 95% confidence intervals for mean y values, and 95% prediction intervals for single y values are given above.
Example - Mean Annual Temperature vs. Mortality 95% prediction interval for a single y value at x = 45: (67.62, 100.98). 95% confidence interval for the mean y value at x = 40: (67.20, 77.82).
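The two kinds of interval can be reproduced from quantities in the Minitab output. The following is a minimal Python sketch for x* = 40, using the reported fit (72.51), SE Fit (2.48), and S (7.545). It assumes the usual forms: fit +/- t * SE Fit for the mean y value, and fit +/- t * sqrt(S^2 + SE Fit^2) for a single y value. The t critical value 2.145 (95%, df = 14) is read from a standard t table and is an assumption of this sketch.

```python
import math

# Values reported by Minitab for x* = 40 in the temperature vs. mortality example.
fit = 72.51      # a + b x*
se_fit = 2.48    # estimated standard deviation of a + b x*
s_e = 7.545      # S in the Minitab output

# Assumption: two-sided 95% t critical value for df = n - 2 = 14, from a t table.
t_crit = 2.145

# Confidence interval for the mean y value at x*.
ci_margin = t_crit * se_fit
conf_int = (fit - ci_margin, fit + ci_margin)

# Prediction interval for a single y value at x*: adds the variance of one observation.
pi_margin = t_crit * math.sqrt(s_e ** 2 + se_fit ** 2)
pred_int = (fit - pi_margin, fit + pi_margin)

print(f"95% CI: ({conf_int[0]:.2f}, {conf_int[1]:.2f})")
print(f"95% PI: ({pred_int[0]:.2f}, {pred_int[1]:.2f})")
```

Because the inputs are rounded, the results can differ from the Minitab intervals in the second decimal place. The prediction interval is always wider than the confidence interval at the same x*, since it must also allow for the deviation of a single observation from the mean.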
A Test for Independence in a Bivariate Normal Population Null hypothesis: $H_0$: $\rho = 0$. Assumption: r is the correlation coefficient for a random sample from a bivariate normal population. Test statistic: $t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$. The t critical value is based on df = n - 2.
A Test for Independence in a Bivariate Normal Population Alternate hypothesis: $H_a$: $\rho > 0$ (Positive dependence): P-value is the area under the appropriate t curve to the right of the computed t. Alternate hypothesis: $H_a$: $\rho < 0$ (Negative dependence): P-value is the area under the appropriate t curve to the left of the computed t.
Example Recall the data from the study of %Fat vs. Age for humans. There are 18 data points and a quick calculation of the Pearson correlation coefficient gives r = 0.79209. We will test to see if there is a dependence at the 0.05 significance level.
Example
Example
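The test statistic for the %Fat vs. Age example can be computed as follows. This is a minimal Python sketch, assuming the usual test statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) with df = n - 2. The function name is hypothetical, and the two-sided critical value 2.120 (0.05 level, df = 16) is read from a standard t table rather than computed.

```python
import math

def correlation_t(r, n):
    """t statistic for testing that the population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Values from the %Fat vs. Age example.
r, n = 0.79209, 18
t = correlation_t(r, n)

# Assumption: two-sided 0.05 critical value for df = 16, from a t table.
t_crit = 2.120
reject_h0 = abs(t) > t_crit

print(f"t = {t:.2f}, reject H0 at the 0.05 level: {reject_h0}")
```

The resulting t agrees with the T value for the slope in the %Fat vs. Age Minitab output (5.19). That is expected: in simple linear regression, the test of zero population correlation and the model utility test give the same t statistic.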
Another Example Height vs. Joint Length The professor in an elementary statistics class wanted to explain correlation so he needed some bivariate data. He asked his class (presumably a random or representative sample of late adolescent humans) to measure the length of the metacarpal bone on the index finger of the right hand (in cm) and height (in ft). The data are provided on the next slide.
Example - Height vs. Joint Length There are 17 data points and a quick calculation of the Pearson correlation coefficient gives r = 0.74908. We will test to see if the true population correlation coefficient is positive at the 0.05 level of significance.
Example - Height vs. Joint Length
Example - Height vs. Joint Length
Example - Height vs. Joint Length

Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 

Recently uploaded (20)

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 

Chapter13

  • 1. Chapter 13 Simple Linear Regression & Correlation Inferential Methods
  • 4. Probabilistic Models [Figure: deviations from the deterministic part of a probabilistic model; one labeled deviation is e = -1.5.]
  • 5. Simple Linear Regression Model The simple linear regression model assumes that there is a line with vertical or y intercept $\alpha$ and slope $\beta$, called the true or population regression line. When a value of the independent variable x is fixed and an observation on the dependent variable y is made, $y = \alpha + \beta x + e$. Without the random deviation e, all observed (x, y) points would fall exactly on the population regression line. The inclusion of e in the model equation allows points to deviate from the line by random amounts.
  • 6. Simple Linear Regression Model [Figure: the population regression line, with slope $\beta$ and vertical intercept $\alpha$, and observations made at x = x1 and x = x2, each shown with its deviation e from the line.]
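  • 8. More About the Simple Linear Regression Model Under the model, (mean y value for fixed x) = $\alpha + \beta x$, the height of the population regression line above x, and (standard deviation of y for fixed x) = $\sigma$. For any fixed x value, y itself has a normal distribution.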
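The model described above can be illustrated with a small simulation. This is a minimal sketch and not part of the original slides: the values of `alpha`, `beta`, and `sigma`, the range of x, and the sample size are hypothetical choices made only for illustration.

```python
import random
import statistics

# Hypothetical population parameters (assumptions, not from the slides).
alpha, beta, sigma = 3.0, 0.5, 2.0

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.uniform(20, 60) for _ in range(10_000)]

# y = alpha + beta * x + e, with e drawn from a normal distribution
# that has mean 0 and standard deviation sigma.
ys = [alpha + beta * x + rng.gauss(0, sigma) for x in xs]

# Deviations of the observations from the population regression line.
deviations = [y - (alpha + beta * x) for x, y in zip(xs, ys)]

print("mean deviation:", statistics.mean(deviations))
print("sd of deviations:", statistics.stdev(deviations))
```

With a large sample, the mean deviation should be close to 0 and the standard deviation of the deviations close to the chosen `sigma`, which is what the model assumptions describe.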
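  • 11. Estimates for the Regression Line The point estimates of $\beta$, the slope, and $\alpha$, the y intercept of the population regression line, are the slope and y intercept, respectively, of the least squares line. That is, $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$, where $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$ and $S_{xx} = \sum (x - \bar{x})^2$.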
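The least squares estimates can be computed directly from their definitions. The sketch below is illustrative only: the function name `least_squares` and the five data points are hypothetical and do not come from the slides.

```python
def least_squares(x, y):
    """Return (a, b): intercept and slope of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx          # estimated slope
    a = y_bar - b * x_bar    # estimated y intercept
    return a, b


# Hypothetical data, used only to exercise the formulas.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares(x, y)
print(f"a = {a:.2f}, b = {b:.2f}")
```

For these made-up points, Sxx = 10 and Sxy = 19.9, so the slope is 1.99 and the intercept is 6.02 - 1.99(3) = 0.05.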
  • 13. Example The following data were collected in a study of age and fatness in humans. * Mazess, R.B., Peppler, W.W., and Gibbons, M. (1984) Total body composition by dual-photon (153Gd) absorptiometry. American Journal of Clinical Nutrition, 40, 834-839. One of the questions was, “What is the relationship between age and fatness?”
  • 17. Example A point estimate for the %Fat for a human who is 45 years old is $\hat{y} = a + bx = 3.22 + 0.548(45) \approx 27.9$, using the least squares line fitted to these data. If 45 is put into the equation for x, we have either an estimated %Fat for a single 45 year old human or an estimated average %Fat for all 45 year old humans. The two interpretations are quite different.
  • 18. Example A plot of the data points along with the least squares regression line created with Minitab is given to the right.
  • 20. Definition formulae The total sum of squares, denoted by SSTo, is defined as $\text{SSTo} = \sum (y - \bar{y})^2$. The residual sum of squares, denoted by SSResid, is defined as $\text{SSResid} = \sum (y - \hat{y})^2$.
  • 21. Calculation Formulae Recalled SSTo and SSResid are generally found as part of the standard output from most statistical packages or can be obtained using the following computational formulas: $\text{SSTo} = \sum y^2 - (\sum y)^2/n$ and $\text{SSResid} = \sum y^2 - a\sum y - b\sum xy$.
  • 23. Estimated Standard Deviation, $s_e$ The statistic for estimating the variance $\sigma^2$ is $s_e^2 = \text{SSResid}/(n - 2)$, where $\text{SSResid} = \sum (y - \hat{y})^2$.
  • 24. Estimated Standard Deviation, $s_e$ The estimate of $\sigma$ is the estimated standard deviation $s_e = \sqrt{s_e^2}$. The number of degrees of freedom associated with estimating $\sigma^2$ or $\sigma$ in simple linear regression is n - 2.
  • 27. Example continued With $r^2 = 0.627$, or 62.7%, we can say that 62.7% of the observed variation in %Fat can be attributed to the probabilistic linear relationship with human age. The magnitude of a typical sample deviation from the least squares line is about 5.75(%), which is reasonably large compared to the y values themselves. This suggests that the model is useful only in the sense of providing gross “ballpark” estimates of %Fat for humans based on age.
  • 29. Estimated Standard Deviation of b The estimated standard deviation of the statistic b is $s_b = s_e/\sqrt{S_{xx}}$, where $S_{xx} = \sum (x - \bar{x})^2$. When the four basic assumptions of the simple linear regression model are satisfied, the probability distribution of the standardized variable $t = (b - \beta)/s_b$ is the t distribution with df = n - 2.
  • 30. Confidence interval for $\beta$ When the four basic assumptions of the simple linear regression model are satisfied, a confidence interval for $\beta$, the slope of the population regression line, has the form $b \pm (\text{t critical value}) \cdot s_b$, where the t critical value is based on df = n - 2.
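  • 31. Example continued Recall that b = 0.5480, $s_b$ = 0.1056, and n = 18, so df = 16 and the t critical value is 2.12. A 95% confidence interval estimate for $\beta$ is $0.5480 \pm (2.12)(0.1056) = 0.5480 \pm 0.2239$.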
  • 32. Example continued A 95% confidence interval estimate for $\beta$ is (0.324, 0.772). Based on sample data, we are 95% confident that the true mean increase in %Fat associated with a year of age is between 0.324% and 0.772%.
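The interval quoted above can be reproduced from the slope estimate and its standard error. This is a sketch under stated assumptions: b = 0.5480 and s_b = 0.1056 are the Coef and SE Coef entries for Age in the Minitab output for this example, and the t critical value 2.120 (95% confidence, df = 16) is taken from a standard t table rather than computed.

```python
b = 0.5480      # estimated slope (Minitab "Coef" for Age)
s_b = 0.1056    # estimated standard deviation of b (Minitab "SE Coef")
t_crit = 2.120  # assumed table value: 95% confidence, df = n - 2 = 16

half_width = t_crit * s_b
lower, upper = b - half_width, b + half_width

print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

Rounded to three decimals, the endpoints agree with the interval (0.324, 0.772) given on the slide.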
  • 33. Example continued Minitab output looks like

    Regression Analysis: % Fat y versus Age (x)

    The regression equation is
    % Fat y = 3.22 + 0.548 Age (x)

    Predictor     Coef  SE Coef     T      P
    Constant     3.221    5.076  0.63  0.535
    Age (x)     0.5480   0.1056  5.19  0.000

    S = 5.754   R-Sq = 62.7%   R-Sq(adj) = 60.4%

    Analysis of Variance
    Source          DF       SS      MS      F      P
    Regression       1   891.87  891.87  26.94  0.000
    Residual Error  16   529.66   33.10
    Total           17  1421.54

    Annotations: the regression line is % Fat y = 3.22 + 0.548 Age (x); estimated y intercept a = 3.221; estimated slope b = 0.5480; residual df = n - 2 = 16; SSResid = 529.66; SSTo = 1421.54.
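Several entries in this output are linked by the formulas from the earlier slides. The following sketch recomputes S and R-Sq from the sums of squares in the Analysis of Variance table; the variable names are illustrative, and n = 18 follows from the Total DF of 17.

```python
import math

n = 18              # Total DF = n - 1 = 17
ss_resid = 529.66   # "Residual Error" SS
ss_to = 1421.54     # "Total" SS

s_e_squared = ss_resid / (n - 2)    # estimate of sigma^2
s_e = math.sqrt(s_e_squared)        # reported by Minitab as S
r_squared = 1 - ss_resid / ss_to    # reported by Minitab as R-Sq

print(f"s_e^2 = {s_e_squared:.2f}")
print(f"s_e   = {s_e:.3f}")
print(f"r^2   = {r_squared:.3f}")
```

The results round to 33.10, 5.754, and 0.627, matching the Residual Error MS, S, and R-Sq entries in the output.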
  • 34. Hypothesis Tests Concerning $\beta$ Null hypothesis: $H_0: \beta$ = hypothesized value
  • 38. Hypothesis Tests Concerning $\beta$ Quite often the test is performed with the hypotheses $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$. This particular form of the test is called the model utility test for simple linear regression. The null hypothesis specifies that there is no useful linear relationship between x and y, whereas the alternative hypothesis specifies that there is a useful linear relationship between x and y. The test statistic simplifies to $t = b/s_b$ and is called the t ratio.
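As a worked illustration of the t ratio, the sketch below uses the standard slope test statistic, t = (b - hypothesized value) / s_b, which reduces to b / s_b for the model utility test. The helper name `slope_t_statistic` is hypothetical; the inputs are the slope and standard error from the %Fat vs. Age Minitab output, and the critical value 2.120 (two-tailed, 0.05 level, df = 16) is an assumed table value.

```python
def slope_t_statistic(b, s_b, hypothesized=0.0):
    """t statistic for testing H0: beta = hypothesized value."""
    return (b - hypothesized) / s_b


# %Fat vs. Age example: Coef and SE Coef for Age from the Minitab output.
t_ratio = slope_t_statistic(0.5480, 0.1056)

t_crit = 2.120  # assumed table value: two-tailed, alpha = 0.05, df = 16
reject_h0 = abs(t_ratio) > t_crit

print(f"t ratio = {t_ratio:.2f}, reject H0: {reject_h0}")
```

The computed t ratio rounds to 5.19, the T value Minitab reports for Age, and it exceeds the critical value, so the data support a useful linear relationship.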
  • 39. Example Consider the following data on percentage unemployment and suicide rates. * Smith, D. (1977) Patterns in Human Geography, Canada: Douglas David and Charles Ltd., 158.
  • 40. Example The plot of the data points produced by Minitab follows
  • 42. Example Some basic summary statistics
  • 43. Example Continuing with the calculations
  • 44. Example Continuing with the calculations
  • 50. Example - Minitab Output

    Regression Analysis: Suicide Rate (y) versus Percentage Unemployed (x)

    The regression equation is
    Suicide Rate (y) = - 93.9 + 59.1 Percentage Unemployed (x)

    Predictor     Coef  SE Coef      T      P
    Constant    -93.86    51.25  -1.83  0.100
    Percenta     59.05    14.24   4.15  0.002

    S = 36.06   R-Sq = 65.7%   R-Sq(adj) = 61.8%

    Annotations: T = 4.15 is the t value for the model utility test of $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$; P = 0.002 is its P-value.
  • 52. Residual Analysis To check on these assumptions, one would examine the deviations $e_1, e_2, \ldots, e_n$. Generally, the deviations are not known, so we check on the assumptions by looking at the residuals, which are the deviations from the estimated line, a + bx. The residuals are given by $y_1 - \hat{y}_1 = y_1 - (a + bx_1), \ldots, y_n - \hat{y}_n = y_n - (a + bx_n)$.
  • 53. Standardized Residuals Recall: A quantity is standardized by subtracting its mean value and then dividing by its true (or estimated) standard deviation. For the residuals, the true mean is zero (0) if the assumptions are true. The estimated standard deviation of a residual depends on the x value. The estimated standard deviation of the ith residual, $y_i - \hat{y}_i$, is given by $s_e \sqrt{1 - \frac{1}{n} - \frac{(x_i - \bar{x})^2}{S_{xx}}}$, where $S_{xx} = \sum (x - \bar{x})^2$.
  • 54. Standardized Residuals As you can see from the formula for the estimated standard deviation, the calculation of the standardized residuals is a bit of a computational nightmare. Fortunately, most statistical software packages are set up to perform these calculations and do so quite proficiently.
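To show what such software does, here is a minimal sketch of the standardized residual calculation, dividing each residual by s_e * sqrt(1 - 1/n - (x_i - x_bar)^2 / S_xx). The data are hypothetical and are not the unemployment and suicide rate data from the slides.

```python
import math

# Hypothetical (x, y) data, used only to illustrate the calculation.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = s_xy / s_xx        # least squares slope
a = y_bar - b * x_bar  # least squares intercept

# Residuals: observed y minus the value on the estimated line.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
s_e = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Each residual is divided by its own estimated standard deviation,
# which depends on how far x_i is from the mean of the x values.
standardized = [
    r / (s_e * math.sqrt(1 - 1 / n - (xi - x_bar) ** 2 / s_xx))
    for xi, r in zip(x, residuals)
]

for xi, r, z in zip(x, residuals, standardized):
    print(f"x = {xi:.0f}  residual = {r:+.3f}  standardized = {z:+.3f}")
```

Note that residuals at x values far from the mean are divided by a smaller standard deviation, so equal raw residuals do not give equal standardized residuals.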
  • 55. Standardized Residuals - Example Consider the data on percentage unemployment and suicide rates. Notice that the standardized residual for Pittsburgh is -2.50, somewhat large for a data set of this size.
  • 56. Example [Scatterplot with the Pittsburgh point labeled.] This point has an unusually large residual.
  • 57. Normal Plots Notice that both of the normal plots look similar. If a software package is available to do the calculation and plots, it is preferable to look at the normal plot of the standardized residuals. In both cases, the points look reasonably linear with the possible exception of Pittsburgh, so the assumption that the errors are normally distributed seems to be supported by the sample data.
  • 58. More Comments The fact that Pittsburgh has a large standardized residual makes it worthwhile to look at that city carefully to make sure the figures were reported correctly. One might also look to see if there are some reasons that Pittsburgh should be looked at separately because some other characteristic distinguishes it from all of the other cities. Pittsburgh does have a large effect on the model.
  • 59. Visual Interpretation of Standardized Residuals This plot is an example of a satisfactory plot that indicates that the model assumptions are reasonable.
  • 60. Visual Interpretation of Standardized Residuals This plot suggests that a curvilinear regression model is needed.
  • 61. Visual Interpretation of Standardized Residuals This plot suggests a non-constant variance. The assumptions of the model are not correct.
  • 62. Visual Interpretation of Standardized Residuals This plot shows a data point with a large standardized residual.
  • 63. Visual Interpretation of Standardized Residuals This plot shows a potentially influential observation.
  • 64. Example - % Unemployment vs. Suicide Rate This plot of the residuals (errors) indicates some possible problems with this linear model. You can see a pattern to the points. Plot annotations: a generally decreasing pattern to the marked points; an unusually large residual; two points that are quite influential since they are far away from the others in terms of the % unemployed.
  • 67. Additional Information about the Sampling Distribution of a + bx for a Fixed x Value The estimated standard deviation of the statistic a + bx*, denoted by $s_{a+bx^*}$, is given by $s_{a+bx^*} = s_e \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$, where $S_{xx} = \sum (x - \bar{x})^2$. When the four basic assumptions of the simple linear regression model are satisfied, the probability distribution of the standardized variable $t = \frac{a + bx^* - (\alpha + \beta x^*)}{s_{a+bx^*}}$ is the t distribution with df = n - 2.
  • 68. Confidence Interval for a Mean y Value When the four basic assumptions of the simple linear regression model are met, a confidence interval for $\alpha + \beta x^*$, the average y value when x has the value x*, is $a + bx^* \pm (\text{t critical value}) \cdot s_{a+bx^*}$, where the t critical value is based on df = n - 2. Many authors give the following equivalent form for the confidence interval: $a + bx^* \pm (\text{t critical value}) \cdot s_e \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$, with $S_{xx} = \sum (x - \bar{x})^2$.
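  • 69. Prediction Interval for a Single y Value When the four basic assumptions of the simple linear regression model are met, a prediction interval for y*, a single y observation made when x has the value x*, has the form $a + bx^* \pm (\text{t critical value}) \cdot \sqrt{s_e^2 + s_{a+bx^*}^2}$, where the t critical value is based on df = n - 2. Many authors give the following equivalent form for the prediction interval: $a + bx^* \pm (\text{t critical value}) \cdot s_e \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$, with $S_{xx} = \sum (x - \bar{x})^2$.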
  • 70. Example - Mean Annual Temperature vs. Mortality Data were collected in certain regions of Great Britain, Norway and Sweden to study the relationship between the mean annual temperature and the mortality rate for a specific type of breast cancer in women. * Lea, A.J. (1965) New Observations on distribution of neoplasms of female breast in certain European countries. British Medical Journal, 1, 488-490.
  • 71. Example - Mean Annual Temperature vs. Mortality

    Regression Analysis: Mortality index versus Mean annual temperature

    The regression equation is
    Mortality index = - 21.8 + 2.36 Mean annual temperature

    Predictor     Coef  SE Coef      T      P
    Constant    -21.79    15.67  -1.39  0.186
    Mean ann    2.3577   0.3489   6.76  0.000

    S = 7.545   R-Sq = 76.5%   R-Sq(adj) = 74.9%

    Analysis of Variance
    Source          DF      SS      MS      F      P
    Regression       1  2599.5  2599.5  45.67  0.000
    Residual Error  14   796.9    56.9
    Total           15  3396.4

    Unusual Observations
    Obs  Mean ann  Mortalit    Fit  SE Fit  Residual  St Resid
     15      31.8     67.30  53.18    4.85     14.12    2.44RX

    R denotes an observation with a large standardized residual.
    X denotes an observation whose X value gives it large influence.
  • 72. Example - Mean Annual Temperature vs. Mortality The point has a large standardized residual and is influential because of the low Mean Annual Temperature.
  • 73. Example - Mean Annual Temperature vs. Mortality

    Predicted Values for New Observations
    New Obs    Fit  SE Fit        95.0% CI            95.0% PI
          1  53.18    4.85  (42.79,  63.57)   (33.95,  72.41) X
          2  60.72    3.84  (52.48,  68.96)   (42.57,  78.88)
          3  72.51    2.48  (67.20,  77.82)   (55.48,  89.54)
          4  83.34    1.89  (79.30,  87.39)   (66.66, 100.02)
          5  96.09    2.67  (90.37, 101.81)   (78.93, 113.25)
          6  99.16    3.01  (92.71, 105.60)   (81.74, 116.57)
    X denotes a row with X values away from the center

    Values of Predictors for New Observations
    New Obs  Mean ann
          1      31.8
          2      35.0
          3      40.0
          4      44.6
          5      50.0
          6      51.3

    These are the x* values for which the fits, standard errors of the fits, 95% confidence intervals for mean y values, and 95% prediction intervals for single y values are given above.
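The confidence and prediction intervals in this output follow from the interval formulas on the earlier slides. The sketch below recomputes the row for x* = 40.0 from the rounded Fit, SE Fit, and S values printed by Minitab; the t critical value 2.145 (95%, df = 14) is an assumed table value, so the endpoints agree with the output only up to rounding.

```python
import math

fit = 72.51     # a + b x* at x* = 40.0 (Minitab "Fit")
se_fit = 2.48   # estimated standard deviation of a + b x* ("SE Fit")
s_e = 7.545     # Minitab "S"
t_crit = 2.145  # assumed table value: 95%, df = n - 2 = 14

# Confidence interval for the mean y value at x*.
ci_half = t_crit * se_fit
ci = (fit - ci_half, fit + ci_half)

# Prediction interval for a single y value at x*: the extra s_e^2 term
# accounts for the variability of an individual observation.
pi_half = t_crit * math.sqrt(s_e ** 2 + se_fit ** 2)
pi = (fit - pi_half, fit + pi_half)

print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI: ({pi[0]:.2f}, {pi[1]:.2f})")
```

The prediction interval is always wider than the confidence interval at the same x*, because it covers a single new observation rather than the mean of all observations at that x value.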
  • 74. Example - Mean Annual Temperature vs. Mortality Plot annotations: 95% prediction interval for a single y value at x = 45: (67.62, 100.98). 95% confidence interval for the mean y value at x = 40: (67.20, 77.82).
  • 75. A Test for Independence in a Bivariate Normal Population Null hypothesis: $H_0: \rho = 0$. Assumption: r is the correlation coefficient for a random sample from a bivariate normal population. Test statistic: $t = \frac{r}{\sqrt{(1 - r^2)/(n - 2)}}$. The t critical value is based on df = n - 2.
  • 77. Example Recall the data from the study of %Fat vs. Age for humans. There are 18 data points and a quick calculation of the Pearson correlation coefficient gives r = 0.79209. We will test to see if there is a dependence at the 0.05 significance level.
  • 80. Another Example Height vs. Joint Length The professor in an elementary statistics class wanted to explain correlation so he needed some bivariate data. He asked his class (presumably a random or representative sample of late adolescent humans) to measure the length of the metacarpal bone on the index finger of the right hand (in cm) and height (in ft). The data are provided on the next slide.
  • 81. Example - Height vs. Joint Length There are 17 data points and a quick calculation of the Pearson correlation coefficient gives r = 0.74908. We will test to see if the true population correlation coefficient is positive at the 0.05 level of significance.
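Both correlation examples can be worked with the standard test statistic for a population correlation of zero, t = r / sqrt((1 - r^2) / (n - 2)) with df = n - 2. The following is a sketch only: the helper name `correlation_t` is hypothetical, and the critical values 2.120 (two-tailed, 0.05 level, df = 16) and 1.753 (upper-tailed, 0.05 level, df = 15) are assumed table values rather than computed ones.

```python
import math


def correlation_t(r, n):
    """t statistic for testing H0: rho = 0, with df = n - 2."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


# %Fat vs. Age: two-tailed test of dependence, n = 18.
t_fat = correlation_t(0.79209, 18)
reject_fat = abs(t_fat) > 2.120   # assumed table value, df = 16

# Height vs. joint length: upper-tailed test (rho > 0), n = 17.
t_height = correlation_t(0.74908, 17)
reject_height = t_height > 1.753  # assumed table value, df = 15

print(f"%Fat vs. Age:     t = {t_fat:.2f}, reject H0: {reject_fat}")
print(f"Height vs. joint: t = {t_height:.2f}, reject H0: {reject_height}")
```

For the %Fat data the statistic is about 5.19, the same value as the t ratio for the slope in the Minitab output, which illustrates that the two tests are equivalent in simple linear regression.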