Regression

  1. UNIT III REGRESSION
  2. Meaning
     • The dictionary meaning of regression is "the act of returning or going back".
     • The term was first used in 1877 by Francis Galton.
     • Regression is the statistical tool with the help of which we can estimate (predict) the unknown values of one variable from the known values of another variable.
     • It helps to find the average probable change in one variable for a given amount of change in another.
  3. Importance
  4. Regression lines
     • For two variables X and Y, we have two regression lines:
       1. The regression line of X on Y gives estimated values of X for given values of Y.
       2. The regression line of Y on X gives estimated values of Y for given values of X.
  5. Regression Equation
     • Regression equations are algebraic expressions of the regression lines.
     • The regression equation of Y on X is expressed as Y = a + bX, where:
       - Y is the dependent variable and X is the independent variable;
       - 'a' and 'b' are the constants (parameters) of the line;
       - 'a' determines the level of the fitted line (i.e. the distance of the line above or below the origin);
       - 'b' determines the slope of the line (i.e. the change in Y for a unit change in X).
  6. • Regression equations are algebraic expressions of the regression lines.
     • The regression equation of X on Y is expressed as X = a + bY, where:
       - X is the dependent variable and Y is the independent variable;
       - 'a' and 'b' are the constants (parameters) of the line;
       - 'a' determines the level of the fitted line (i.e. the distance of the line above or below the origin);
       - 'b' determines the slope of the line (i.e. the change in X for a unit change in Y).
  7. Method of Least Squares
     • The constants 'a' and 'b' can be calculated by the method of least squares.
     • The line should be drawn through the plotted points in such a manner that the sum of the squares of the vertical deviations of the actual Y values from the estimated Y values, i.e. Σ(Y - Ye)², is the least.
     • Such a line is known as the line of best fit.
     • Using algebra and calculus, the normal equations are:
       - For Y on X: ΣY = Na + bΣX and ΣXY = aΣX + bΣX²
       - For X on Y: ΣX = Na + bΣY and ΣXY = aΣY + bΣY²
     (A worked sketch of solving these equations numerically appears after the slide list.)
  8. Multiple Regression
     • When we use more than one independent variable to estimate the dependent variable in order to increase the accuracy of the estimate, the process is called multiple regression analysis.
     • It is based on the same assumptions and procedure as simple regression.
     • The principal advantage of multiple regression is that it allows us to use more of the available information to estimate the dependent variable.
  9. Estimating equation describing the relationship among three variables: Y = a + b1X1 + b2X2, where
     • Y = estimated value of the dependent variable
     • a = Y intercept
     • b1 and b2 = slopes associated with X1 and X2, respectively
     • X1 and X2 = values of the two independent variables
  10. Normal Equations
      • We use three equations (which statisticians call the "normal equations") to determine the values of the constants a, b1 and b2:
        - ΣY = Na + b1ΣX1 + b2ΣX2
        - ΣX1Y = aΣX1 + b1ΣX1² + b2ΣX1X2
        - ΣX2Y = aΣX2 + b2ΣX2² + b1ΣX1X2
      (A sketch of solving these three equations as a linear system appears after the slide list.)
  11. Difference between regression and correlation
      Correlation:
      • The correlation coefficient (r) between X and Y is a measure of the direction and degree of the linear relationship between X and Y.
      • It does not imply a cause-and-effect relationship between the variables.
      • It indicates the degree of association.
      Regression:
      • bxy and byx are mathematical measures expressing the average relationship between the two variables.
      • It indicates the cause-and-effect relationship between the variables.
      • It is used to forecast the value of the dependent variable when the value of the independent variable is known.
      (A short numeric comparison of r, byx and bxy appears after the slide list.)
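
The following is a minimal Python sketch of slides 5 to 7: it fits the two simple regression lines by solving the least-squares normal equations directly. The sample data, the fit_line helper, and the variable names are invented for illustration and do not come from the slides.

```python
# Minimal sketch (not from the slides): fit y = a + b*x by solving the
# least-squares normal equations
#     sum(y)  = N*a + b*sum(x)
#     sum(xy) = a*sum(x) + b*sum(x^2)
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope from the normal equations
    a = (sy - b * sx) / n                          # intercept: a = (sum(y) - b*sum(x)) / N
    return a, b

# Invented sample data, purely for illustration.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 6]

a_yx, b_yx = fit_line(X, Y)   # regression of Y on X: estimates Y for given X
a_xy, b_xy = fit_line(Y, X)   # regression of X on Y: estimates X for given Y

print(f"Y on X: Y = {a_yx:.3f} + {b_yx:.3f} X")
print(f"X on Y: X = {a_xy:.3f} + {b_xy:.3f} Y")
```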
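Below is a minimal sketch of slides 9 and 10, assuming NumPy is available: the three normal equations are written as a 3x3 linear system and solved for a, b1 and b2. The sample data and variable names are made up for illustration only.

```python
# Minimal sketch (not from the slides): estimate Y = a + b1*X1 + b2*X2 by
# solving the three normal equations of slide 10 as a 3x3 linear system.
import numpy as np

# Invented sample data, purely for illustration.
X1 = np.array([3.0, 5.0, 6.0, 8.0, 12.0, 14.0])
X2 = np.array([16.0, 10.0, 7.0, 4.0, 3.0, 2.0])
Y = np.array([90.0, 72.0, 54.0, 42.0, 30.0, 12.0])
n = len(Y)

# Coefficient matrix and right-hand side of the normal equations:
#   sum(Y)    = N*a       + b1*sum(X1)    + b2*sum(X2)
#   sum(X1*Y) = a*sum(X1) + b1*sum(X1^2)  + b2*sum(X1*X2)
#   sum(X2*Y) = a*sum(X2) + b1*sum(X1*X2) + b2*sum(X2^2)
A = np.array([
    [n,        X1.sum(),        X2.sum()],
    [X1.sum(), (X1 ** 2).sum(), (X1 * X2).sum()],
    [X2.sum(), (X1 * X2).sum(), (X2 ** 2).sum()],
])
rhs = np.array([Y.sum(), (X1 * Y).sum(), (X2 * Y).sum()])

a, b1, b2 = np.linalg.solve(A, rhs)
print(f"Y = {a:.3f} + {b1:.3f}*X1 + {b2:.3f}*X2")
```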
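Finally, a small numeric sketch of slide 11, using the same invented data as above: it computes the correlation coefficient r (degree of association) alongside the regression coefficients byx and bxy (the slopes used for prediction). The check in the last line, byx * bxy = r², is a standard identity rather than something stated on the slide.

```python
# Minimal sketch (not from the slides): contrast the correlation coefficient r
# with the regression coefficients byx and bxy for the same invented data.
import math

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 6]
n = len(X)

sx, sy = sum(X), sum(Y)
sxx = sum(x * x for x in X)
syy = sum(y * y for y in Y)
sxy = sum(x * y for x, y in zip(X, Y))

# Correlation coefficient: direction and degree of the linear relationship.
r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# Regression coefficients: slopes used to predict one variable from the other.
byx = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope of the Y-on-X line
bxy = (n * sxy - sx * sy) / (n * syy - sy ** 2)  # slope of the X-on-Y line

print(f"r = {r:.3f}, byx = {byx:.3f}, bxy = {bxy:.3f}")
print(f"byx * bxy = {byx * bxy:.3f}  (equals r^2 = {r * r:.3f})")  # standard identity
```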
