St. Xavier’s College of Management
and Technology
REGRESSION
ANALYSIS
A PRESENTATION BY
SIMRAN JEET KAUR
SONAL PRABHAT
SRIKANT PANDEY
SUPRIYA
ANMOL
INTRODUCTION
• According to the Oxford dictionary, the word ‘regression’ means
‘stepping back’ or ‘returning to the average value’.
• The term was first used by the famous biological scientist Sir
Francis Galton in a study of hereditary characteristics.
• Regression Analysis is a statistical process for estimating the
relationship among variables, so that one may be able to
predict the unknown value of one variable for a known value of
another variable.
CONTENTS
• What is Regression Analysis
• Utility of Regression
• Difference between Regression and Correlation
• Types of Regression
• Simple, Multiple, Linear, Non-linear, Partial, Total
• Scatter Diagrams and Relationships
• Estimation using a Regression Line (for linear only)
• The Method of Least Squares
• A Solved Example
• Correlation Analysis
• Using Regression and Correlation Analysis: Limitations, Errors
and Caveats
• Conclusion
WHAT IS
REGRESSION?
The process of predicting one variable from another by
statistical means, using previous data (regression).
Regression Line: a line fitted to a set of data points to
estimate the relationship between two variables.
Regression Analysis attempts to create an estimating
equation for prediction of data (by using previous data).
UTILITY OF
REGRESSION
1. Regression Analysis explains the nature of the relationship
between two variables.
2. It is one of the most commonly used tools for business
analysis.
3. It is widely used for quality control in the corporate sector.
4. It is useful for estimating statistical curves for demand,
supply, price, consumption and cost.
DIFFERENCE BETWEEN
REGRESSION AND
CORRELATION ANALYSIS
CORRELATION
1. Correlation is a relationship
between two or more
variables.
2. It measures the degree of
relationship between two
variables.
3. The coefficient of correlation
is a relative measure. It
lies between -1 and +1.
4. It is not very useful for
further algebraic treatment.
5. The correlation coefficient is
independent of change in
origin and scale.
REGRESSION
1. It is a mathematical relation
showing the average relation
between two variables.
2. It measures the nature of the
relationship between
variables.
3. The regression coefficient is an
absolute figure. The value of a
dependent variable is found
from an independent variable.
4. It is very useful for
further algebraic treatment.
5. The regression coefficient is
independent of change of
origin but not of scale.
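Point 5 on each side can be checked numerically. The following is a minimal sketch, assuming a small made-up data set (the values of `xs` and `ys` are illustrative and not from the slides): adding a constant to X is a change of origin, multiplying X by a constant is a change of scale.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def correlation(xs, ys):
    """Pearson correlation coefficient r between xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def regression_slope(xs, ys):
    """Regression coefficient b of Y on X."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Illustrative data (hypothetical, not taken from the slides)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]

shifted_xs = [x + 100 for x in xs]  # change of origin
scaled_xs = [x * 10 for x in xs]    # change of scale

# r is the same in all three cases
print([round(correlation(v, ys), 4) for v in (xs, shifted_xs, scaled_xs)])
# b survives the shift but changes with the scale
print([round(regression_slope(v, ys), 4) for v in (xs, shifted_xs, scaled_xs)])
```

With this data the correlation coefficient stays at about 0.8528 throughout, while the regression coefficient is 0.8 for the original and shifted X but 0.08 once X is multiplied by 10.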
USE IN BUSINESS
Regression is widely used in the field of business.
Businessmen are interested in predicting future production, consumption,
investments, price, profit, sales etc. So the success of business depends
upon the correctness of the various estimates that they are required to
make.
It is also used in sociological studies and economic planning to project
population, birth rates, death rates etc.
TYPES OF REGRESSION
ANALYSIS
• Simple and Multiple Regression
• Linear and Non-Linear Regression
• Partial and Total Regression
SIMPLE REGRESSION
In Simple Regression Analysis we study only two variables at a time,
of which one is dependent and the other is independent.
e.g. the functional relationship between income and expenditure.
MULTIPLE REGRESSION
In Multiple Regression Analysis we study more than two variables,
of which one is dependent and the others are independent.
e.g. the effect of rain and irrigation on the yield of wheat.
LINEAR REGRESSION
When one variable changes with another variable in a fixed ratio,
it is known as linear regression.
This type of relationship is depicted on a graph by a straight line, or
a first degree equation.
NON-LINEAR REGRESSION
When one variable changes with another variable in a changing
ratio, it is referred to as curvilinear or non-linear regression.
This type of relationship takes the form of a curve on a graph. It is
represented, for example, by a second or third degree equation.
PARTIAL REGRESSION
When more than two variables are involved but the relationship
between only two of them is studied at a time, with the other
variables held constant, it is called Partial Regression.
TOTAL REGRESSION
When, on the other hand, all the variables are studied at the same
time, it is called Total Regression.
SCATTER DIAGRAM
AND RELATIONS
• Determines whether or not a relationship between two variables
exists
• Shows the basic nature of the relationship between two
variables: the independent (X) and the dependent (Y) variable
• Helps us to draw a regression line (a line fitted through the
scatter points to derive a relation between the two variables)
• An important note: the regression line is drawn so that as many
points as possible lie on or near it and roughly equal numbers of
points fall on either side of it.
[Figure: three scatter diagrams (Graph 1, Graph 2, Graph 3), each plotting Y-Values. Slide captions: Inverse Linear with more scattering; Direct Curvilinear; Inverse Curvilinear.]
Estimation Using Regression Line
[Figure: Mrs. Hudson’s Shopping. Money spent (in K) against no. of stores visited, with plotted points (0, 3), (1, 5) and (2, 7).]
Estimating equation for a straight line:
Y = a + bX
where,
Y = Dependent variable
a = Y intercept (constant)
b = slope of the line (constant)
X = Independent variable
Mrs. Hudson wants to determine how much money
she will end up spending if she visits 5 stores. (Let b
be 2)
By graph: a = 3
b = 2
X = 5
Therefore: Y = a + bX
Y = 3 + 2(5)
Y = 13
Therefore Mrs. Hudson learns that she
will probably spend 13K if she visits 5
stores.
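The estimate above can be reproduced in a few lines. This is a minimal sketch; the function name `estimate` is an illustrative choice, and the values of a and b are the ones read off the graph in the slide.

```python
def estimate(a, b, x):
    """Return Y = a + bX for intercept a, slope b and input x."""
    return a + b * x

# Values read off Mrs. Hudson's graph in the slide
a = 3  # Y-intercept: money spent (in K) at 0 stores
b = 2  # slope: extra money spent (in K) per store visited

print(estimate(a, b, 5))  # 13
```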
Finding the Slope of Straight Line
Slope of a straight line:
b = (Y₂ − Y₁) / (X₂ − X₁)
Let us try determining the slope for Mrs. Hudson’s
Shopping graph, using the points (1, 5) and (2, 7).
Therefore,
b = (Y₂ − Y₁) / (X₂ − X₁)
b = (7 − 5) / (2 − 1)
b = 2 / 1 = 2
The slope of the line is 2.
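The same slope calculation can be sketched in code. The helper name `slope` is hypothetical; the points are the three plotted on Mrs. Hudson's graph.

```python
def slope(p1, p2):
    """Slope of the straight line through p1 = (x1, y1) and p2 = (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("slope is undefined for a vertical line")
    return (y2 - y1) / (x2 - x1)

points = [(0, 3), (1, 5), (2, 7)]  # points plotted on Mrs. Hudson's graph

print(slope(points[1], points[2]))  # 2.0
print(slope(points[0], points[1]))  # 2.0
```

Any pair of points gives the same answer here because all three lie exactly on one straight line.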
METHOD OF LEAST
SQUARES
• How can we “fit” a line mathematically if none of the points lie on
the line?
• We shall see how to determine the equation of a line drawn
through a set of points.
• The Estimating Line Equation: Ŷ = a + bX
• Ŷ symbolizes the estimated points, or the points that lie on the
regression line.
• Let’s take an example.
DEMONSTRATION OF
USE OF LEAST SQUARES
METHOD
[Figure: two candidate estimating lines through the actual points Y = 8, 1, 6. Graph (A): estimates Ŷ = 6, 5, 4, with errors 2, -4, 2. Graph (B): estimates Ŷ = 2, 5, 8, with errors 6, -4, -2.]
Using Absolute
Values
Vector Values
Summing the
Errors of the
Estimating
Line
│Y – Ŷ│ Y – Ŷ
│8 – 6│ = 2 8 – 6 = 2
│1 – 5│ = 4 1 – 5 = -4
│6 – 4│ = 2 6 – 4 = 2
Total = 8
(Absl. Error)
= 0
(Total Error)
Using Absolute
Values
Vector Values
Summing the
Errors of the
Estimating
Line
│Y – Ŷ│ Y – Ŷ
│8 – 2│ = 6 8 – 2 = 6
│1 – 5│ = 4 1 – 5 = -4
│6 – 8│ = 2 6 – 8 = -2
Total = 12
(Absl. Error)
= 0
(Total Error)
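The two error sums for graphs (A) and (B) can be recomputed as follows. This is a minimal sketch using the actual and estimated values from the slides; the function names are illustrative.

```python
actual = [8, 1, 6]  # observed Y values
line_a = [6, 5, 4]  # estimates Ŷ from line (A)
line_b = [2, 5, 8]  # estimates Ŷ from line (B)

def signed_error_sum(ys, estimates):
    """Sum of Y - Ŷ; positive and negative errors can cancel."""
    return sum(y - y_hat for y, y_hat in zip(ys, estimates))

def absolute_error_sum(ys, estimates):
    """Sum of |Y - Ŷ|; every error counts as a positive amount."""
    return sum(abs(y - y_hat) for y, y_hat in zip(ys, estimates))

for name, estimates in (("A", line_a), ("B", line_b)):
    print(name, signed_error_sum(actual, estimates), absolute_error_sum(actual, estimates))
# A 0 8
# B 0 12
```

Both lines have a signed total of 0, so the signed sum cannot tell them apart, while the absolute sum favours (A).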
(D) IS A BETTER FIT,
BUT…
[Figure: two more candidate lines through the actual points Y = 4, 7, 2. Graph (C): estimated points Ŷ = 4, 3, 2, with a single error of 4. Graph (D): estimated points Ŷ = 5, 4, 3, shown with a linear trend line through the estimated points.]
The Absolute Error │Y – Ŷ│, Graph (C)
│4 – 4│ = 0
│7 – 3│ = 4
│2 – 2│ = 0
Total = 4

The Absolute Error │Y – Ŷ│, Graph (D)
│4 – 5│ = 1
│7 – 4│ = 3
│2 – 3│ = 1
Total = 5
The total absolute error in (D) is higher than in (C) (5 against 4),
even though (D) is visibly the better fit. The sum of absolute
errors alone is therefore not a reliable way to find the best
regression line, which is why we shall use….
THE METHOD OF
LEAST SQUARES
The Squared Error (Y – Ŷ)², Graph (C)
(4 – 4)² = 0
(7 – 3)² = (4)² = 16
(2 – 2)² = 0
Total = 16

The Squared Error (Y – Ŷ)², Graph (D)
(4 – 5)² = 1
(7 – 4)² = (3)² = 9
(2 – 3)² = 1
Total = 11
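The comparison between (C) and (D) under both criteria can be sketched as below. The pairing of actual and estimated points is an assumption read from the figure labels and tables in the slides; the function names are illustrative.

```python
actual = [4, 7, 2]  # observed Y values (pairing assumed from the slides)
line_c = [4, 3, 2]  # estimates Ŷ from line (C)
line_d = [5, 4, 3]  # estimates Ŷ from line (D)

def absolute_error_sum(ys, estimates):
    return sum(abs(y - y_hat) for y, y_hat in zip(ys, estimates))

def squared_error_sum(ys, estimates):
    return sum((y - y_hat) ** 2 for y, y_hat in zip(ys, estimates))

for name, estimates in (("C", line_c), ("D", line_d)):
    print(name, absolute_error_sum(actual, estimates), squared_error_sum(actual, estimates))
# C 4 16
# D 5 11
```

The absolute criterion prefers (C), while the squared criterion prefers (D), because squaring weighs the single miss of 4 in (C) more heavily than the three small misses in (D).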
THE METHOD OF
LEAST SQUARES
What does it do?
• It magnifies, or penalizes, the larger errors.
• Squaring makes every error non-negative, so positive and negative
errors cannot cancel each other out (no need for the mod operator).
• Because we look for the estimating line that minimizes the sum
of the squares of the errors, we call this the least squares
method.
• Now let us take a look at how to calculate the constants for the
line estimated in this way.
Slope of the best-fitting regression line:
b = (∑XY − n·X̄·Ȳ) / (∑X² − n·X̄²)
where,
X̄ = mean of values of the independent variable
Ȳ = mean of values of the dependent variable
n = the number of data points
X = values of the independent variable
Y = values of the dependent variable
Y-intercept of the best-fitting regression line:
a = Ȳ − b·X̄
where,
X̄ = mean of values of the independent variable
Ȳ = mean of values of the dependent variable
b = slope from previous equation
e.g. taking the data below:
Truck No. | Age of Truck (X) | Repair Expense (Y) | XY
101 | 5 | 7 | 35
102 | 3 | 7 | 21
103 | 3 | 6 | 18
104 | 1 | 4 | 4
∑XY = 78
X̄ = 3
Ȳ = 6
n = 4 (data points)
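Plugging this four-truck data into the two formulas can be sketched as follows. The function name `least_squares` is an illustrative choice; the slide itself stops at the intermediate quantities, so the final a and b here are computed rather than quoted.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line Ŷ = a + bX using the mean-based formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    a = y_bar - b * x_bar
    return a, b

ages = [5, 3, 3, 1]      # Age of Truck (X)
expenses = [7, 7, 6, 4]  # Repair Expense (Y)

a, b = least_squares(ages, expenses)
print(a, b)  # 3.75 0.75
```

Here ∑X² = 44, so b = (78 − 72) / (44 − 36) = 0.75 and a = 6 − 0.75(3) = 3.75.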
A SOLVED
EXAMPLE
USING THE LEAST SQUARES METHOD AND
DEVIATION FROM THE ARITHMETIC MEAN
METHOD
The Director of the Beverly Hills Sanitation Department is interested
in the relationship between the age of a garbage truck and the
repair expenses she should expect to incur.
Analysts gather data on 5 trucks and present it as:
Truck No. | Age of Truck (X) | Repair Expenses (Y) (in $100)
101 | 3 | 6
102 | 2 | 1
103 | 7 | 8
104 | 4 | 5
105 | 8 | 9
ARRANGE THE DATA
Trucks (n = 5) | Age (X) | Repair Expense (Y) | XY | X² | Y²
101 | 3 | 6 | 18 | 9 | 36
102 | 2 | 1 | 2 | 4 | 1
103 | 7 | 8 | 56 | 49 | 64
104 | 4 | 5 | 20 | 16 | 25
105 | 8 | 9 | 72 | 64 | 81
Total | ∑X = 24 | ∑Y = 29 | ∑XY = 168 | ∑X² = 142 | ∑Y² = 207
FIRST METHOD (1)
• Find the numerical constant b (slope of line):
b = (∑XY − n·X̄·Ȳ) / (∑X² − n·X̄²)
• Substitute values (X̄ = 4.8, Ȳ = 5.8):
b = (168 − 5(4.8)(5.8)) / (142 − 5(4.8)²)
b = (168 − 139.2) / (142 − 115.2)
b = 28.8 / 26.8
b = 1.07
• Find the numerical constant a (Y-intercept):
a = Ȳ − b·X̄
• Substitute values (X̄ = 4.8, Ȳ = 5.8):
a = 5.8 − 1.07(4.8)
a = 5.8 − 5.136
a = 0.66
Regression Line equation (substitute the values):
Ŷ = a + bX
Ŷ = 0.66 + 1.07X
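The first method can be checked step by step in code. This is a minimal sketch on the five-truck data; the variable `a_slides` is a hypothetical name for the intercept obtained when b is rounded to two decimals before use, as the slide does.

```python
ages = [3, 2, 7, 4, 8]      # Age of Truck (X)
expenses = [6, 1, 8, 5, 9]  # Repair Expense (Y), in $100

n = len(ages)
x_bar = sum(ages) / n                                # 4.8
y_bar = sum(expenses) / n                            # 5.8
sum_xy = sum(x * y for x, y in zip(ages, expenses))  # 168
sum_x2 = sum(x * x for x in ages)                    # 142

b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
a = y_bar - b * x_bar                 # intercept with full-precision b
a_slides = y_bar - round(b, 2) * x_bar  # intercept with b rounded to 1.07 first

print(round(b, 4))         # 1.0746
print(round(a, 4))         # 0.6418
print(round(a_slides, 2))  # 0.66
```

Carrying b at full precision gives an intercept of about 0.64; the 0.66 on the slide comes from rounding b to 1.07 before computing a.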
SECOND METHOD (2)
Normal equation to find ‘a’ is:
∑Y = na + b∑X
Normal equation to find ‘b’ is:
∑XY = a∑X + b∑X²
Substituting values from the table, we get:
29 = 5a + 24b ……………(i)
168 = 24a + 142b, which on dividing by 2 gives
84 = 12a + 71b ………….(ii)
Multiplying equation (i) by 12 and (ii) by 5:
348 = 60a + 288b ……………(iii)
420 = 60a + 355b ……………(iv)
Subtracting (iii) from (iv) gives 72 = 67b, so b = 1.07.
Substituting b = 1.07 in (i) gives 5a = 29 − 25.68 = 3.32, so a = 0.66.
a = 0.66
b = 1.07
The Estimating Equation becomes:
Ŷ = 0.66 + 1.07X
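The two normal equations form a 2×2 linear system in a and b. The sketch below solves it with Cramer's rule rather than the elimination used in the slides; that substitution is a choice made here for brevity, and the function name is hypothetical.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2):
    """Solve  sum_y = n*a + b*sum_x  and  sum_xy = a*sum_x + b*sum_x2  for (a, b)."""
    det = n * sum_x2 - sum_x * sum_x
    if det == 0:
        raise ValueError("all X values are identical; the line is not determined")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b

# Totals from the five-truck table
a, b = solve_normal_equations(n=5, sum_x=24, sum_y=29, sum_xy=168, sum_x2=142)

print(round(a, 4), round(b, 4))  # 0.6418 1.0746
```

Both methods give the same line. The intercept prints as about 0.64 rather than 0.66 because b is not rounded to 1.07 before it is used.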
END OF
PROBLEM
EXTRA:
CORRELATION ANALYSIS
After discovering the type of relationship between two variables,
what may be the next step?
Correlation analysis helps us to know the degree to which the two
variables are related.
It helps us to determine the extent of reliability through measures
such as the coefficient of determination and the coefficient of correlation.
Since this topic is out of scope, we do not discuss it here. The next
steps, however, are correlation analysis and multiple regression.
REGRESSION AND CORRELATION
ANALYSIS: LIMITATIONS, ERRORS
AND CAVEATS
• Extrapolation beyond the Range of the Observed Data
An estimating equation is valid only over the range from which the
sample was originally taken.
• Cause and Effect
Regression and correlation analysis do not show that a change in one
variable causes a change in another variable.
• Using Past Trends to Estimate Future Trends
Historical data must be reappraised before we use it to determine
estimating equations for the present.
• Finding Relationships when they do not Exist
It takes knowledge and common sense to judge which relationships are
meaningful and which are meaningless.
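The extrapolation caveat can be enforced in code. The following is a minimal sketch, assuming the truck ages and the estimating equation from the solved example; the function `predict_within_range` is hypothetical and simply refuses inputs outside the observed range of X.

```python
def predict_within_range(a, b, x, observed_xs):
    """Return a + b*x, refusing inputs outside the observed X range."""
    low, high = min(observed_xs), max(observed_xs)
    if not low <= x <= high:
        raise ValueError(f"x={x} lies outside the observed range [{low}, {high}]")
    return a + b * x

ages = [3, 2, 7, 4, 8]  # truck ages from the solved example
a, b = 0.66, 1.07       # estimating equation from the solved example

print(round(predict_within_range(a, b, 5, ages), 2))  # 6.01

try:
    predict_within_range(a, b, 12, ages)  # a 12-year-old truck was never observed
except ValueError as error:
    print(error)
```

Raising an error is one possible design; a softer alternative would be to return the estimate together with a warning.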
CONCLUSION
Thus we realise why Regression Analysis is so widely used in
all kinds of fields, and not just business.
Both Regression and Correlation Analysis are powerful tools for
predicting outcomes and forecasting data from
previously collected samples.
Regression Analysis has its limitations, however, and should be
used at the discretion of the user.
END
Questions
THANK YOU

More Related Content

What's hot

Statistical Estimation
Statistical Estimation Statistical Estimation
Statistical Estimation Remyagharishs
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression AnalysisSalim Azad
 
Correlation and regression analysis
Correlation and regression analysisCorrelation and regression analysis
Correlation and regression analysis_pem
 
Multiple Correlation - Thiyagu
Multiple Correlation - ThiyaguMultiple Correlation - Thiyagu
Multiple Correlation - ThiyaguThiyagu K
 
Regression analysis
Regression analysisRegression analysis
Regression analysissaba khan
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regressiondessybudiyanti
 
Probability Theory
Probability TheoryProbability Theory
Probability TheoryParul Singh
 
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierRegression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierAl Arizmendez
 
Auto Correlation Presentation
Auto Correlation PresentationAuto Correlation Presentation
Auto Correlation PresentationIrfan Hussain
 

What's hot (20)

Statistical Estimation
Statistical Estimation Statistical Estimation
Statistical Estimation
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Correlation and regression analysis
Correlation and regression analysisCorrelation and regression analysis
Correlation and regression analysis
 
Regression ppt
Regression pptRegression ppt
Regression ppt
 
Multiple Correlation - Thiyagu
Multiple Correlation - ThiyaguMultiple Correlation - Thiyagu
Multiple Correlation - Thiyagu
 
Autocorrelation
AutocorrelationAutocorrelation
Autocorrelation
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Heteroscedasticity
HeteroscedasticityHeteroscedasticity
Heteroscedasticity
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regression
 
Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and Regression
 
Probability Theory
Probability TheoryProbability Theory
Probability Theory
 
Ols by hiron
Ols by hironOls by hiron
Ols by hiron
 
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierRegression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Auto Correlation Presentation
Auto Correlation PresentationAuto Correlation Presentation
Auto Correlation Presentation
 
Regression
RegressionRegression
Regression
 
Multiple Regression Analysis (MRA)
Multiple Regression Analysis (MRA)Multiple Regression Analysis (MRA)
Multiple Regression Analysis (MRA)
 
Correlation analysis
Correlation analysisCorrelation analysis
Correlation analysis
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 

Similar to Regression analysis

Unit-III Correlation and Regression.pptx
Unit-III Correlation and Regression.pptxUnit-III Correlation and Regression.pptx
Unit-III Correlation and Regression.pptxAnusuya123
 
Bba 3274 qm week 6 part 1 regression models
Bba 3274 qm week 6 part 1 regression modelsBba 3274 qm week 6 part 1 regression models
Bba 3274 qm week 6 part 1 regression modelsStephen Ong
 
Regression analysis algorithm
Regression analysis algorithm Regression analysis algorithm
Regression analysis algorithm Sammer Qader
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionAbdelaziz Tayoun
 
Analyzing Relations between Data Set - Part I
Analyzing Relations between Data Set - Part IAnalyzing Relations between Data Set - Part I
Analyzing Relations between Data Set - Part INaseha Sameen
 
Stat 1163 -correlation and regression
Stat 1163 -correlation and regressionStat 1163 -correlation and regression
Stat 1163 -correlation and regressionKhulna University
 
correlationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdfcorrelationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdfDrAmanSaxena
 
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...RekhaChoudhary24
 
regression-linearandlogisitics-220524024037-4221a176 (1).pdf
regression-linearandlogisitics-220524024037-4221a176 (1).pdfregression-linearandlogisitics-220524024037-4221a176 (1).pdf
regression-linearandlogisitics-220524024037-4221a176 (1).pdflisow86669
 
Measure of Dispersion, Range, Mean and Standard Deviation, Correlation and Re...
Measure of Dispersion, Range, Mean and Standard Deviation, Correlation and Re...Measure of Dispersion, Range, Mean and Standard Deviation, Correlation and Re...
Measure of Dispersion, Range, Mean and Standard Deviation, Correlation and Re...Parth Chuahan
 

Similar to Regression analysis (20)

Unit-III Correlation and Regression.pptx
Unit-III Correlation and Regression.pptxUnit-III Correlation and Regression.pptx
Unit-III Correlation and Regression.pptx
 
Bba 3274 qm week 6 part 1 regression models
Bba 3274 qm week 6 part 1 regression modelsBba 3274 qm week 6 part 1 regression models
Bba 3274 qm week 6 part 1 regression models
 
Rsh qam11 ch04 ge
Rsh qam11 ch04 geRsh qam11 ch04 ge
Rsh qam11 ch04 ge
 
Simple Linear Regression.pptx
Simple Linear Regression.pptxSimple Linear Regression.pptx
Simple Linear Regression.pptx
 
Simple egression.pptx
Simple egression.pptxSimple egression.pptx
Simple egression.pptx
 
Regression analysis algorithm
Regression analysis algorithm Regression analysis algorithm
Regression analysis algorithm
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Course pack unit 5
Course pack unit 5Course pack unit 5
Course pack unit 5
 
Statistics for entrepreneurs
Statistics for entrepreneurs Statistics for entrepreneurs
Statistics for entrepreneurs
 
Analyzing Relations between Data Set - Part I
Analyzing Relations between Data Set - Part IAnalyzing Relations between Data Set - Part I
Analyzing Relations between Data Set - Part I
 
Regression
RegressionRegression
Regression
 
Stat 1163 -correlation and regression
Stat 1163 -correlation and regressionStat 1163 -correlation and regression
Stat 1163 -correlation and regression
 
correlationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdfcorrelationcoefficient-20090414 0531.pdf
correlationcoefficient-20090414 0531.pdf
 
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
Simple Correlation : Karl Pearson’s Correlation co- efficient and Spearman’s ...
 
regression-linearandlogisitics-220524024037-4221a176 (1).pdf
regression-linearandlogisitics-220524024037-4221a176 (1).pdfregression-linearandlogisitics-220524024037-4221a176 (1).pdf
regression-linearandlogisitics-220524024037-4221a176 (1).pdf
 
Linear and Logistics Regression
Linear and Logistics RegressionLinear and Logistics Regression
Linear and Logistics Regression
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Measure of Dispersion, Range, Mean and Standard Deviation, Correlation and Re...
Measure of Dispersion, Range, Mean and Standard Deviation, Correlation and Re...Measure of Dispersion, Range, Mean and Standard Deviation, Correlation and Re...
Measure of Dispersion, Range, Mean and Standard Deviation, Correlation and Re...
 
Correlation
CorrelationCorrelation
Correlation
 

Recently uploaded

ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerunnathinaik
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,Virag Sontakke
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 

Recently uploaded (20)

ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developer
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 

Regression analysis

  • 1. St. Xavier’s College of Management and Technology
  • 2. REGRESSION ANALYSIS A PRESENTATION BY SIMRAN JEET KAUR SONAL PRABHAT SRIKANT PANDEY SUPRIYA ANMOL
  • 3. INTRODUCTION • According to Oxford dictionary the word ‘regression’ means ‘stepping back’ or ‘returning to average value’. • The term was first used by famous biological scientist Sir Francis Galton relating to a study of hereditary characteristics. • Regression Analysis is a statistical process for estimating the relationship among variables, so that one may be able to predict the unknown value of one variable for a known value of another variable.
  • 4. CONTENTS • What is Regression Analysis • Utility of Regression • Difference between Regression and Correlation • Types of Regression • Simple, Multiple, Linear, Non-linear, Partial, Total • Scatter Diagrams and Relationships • Estimation using a Regression Lines (for linear only) • The Method of Least Squares • A Solved Example • Correlation Analysis • Using Regression and Correlation Analysis: Limitations, Errors and Caveats • Conclusion
  • 5. WHAT IS REGRESSION? The process of predicting one variable from another by statistical means, using previous data (regression). Regression Line– A line fitted to a set of data points to estimate the relationship between two variables. Regression Analysis attempts to create an estimating equation for prediction of data (by using previous data).
  • 6. UTILITY OF REGRESSION 1. Regression Analysis explains the nature of relationship between two variables. 2. It is one of the most commonly used tool for business analysis. 3. It is widely useful for quality control in corporate sector. 4. It is useful for estimation of statistical curve for demand, supply, price consumption and cost.
  • 7. DIFFERENCE BETWEEN REGRESSION AND CORELATION ANALYISIS CO-RELATION 1. Correlation is a relationship between two or more variables. 2. It is the measure of degree of relationship between two variables. 3. The coefficient of correlation is relative measure. The range lies between -1 and +1. 4. It is not very much useful for further algebraic treatment 5. Correlation coefficient is independent of change in origin and scale. REGRESSION 1. It is mathematical relation showing the average relation between two variables. 2. It is the measure of nature of relationship between variables. 3. Regression coefficient is an absolute figure. Value of an dependent variable is found from an independent variable. 4. It is very much useful for further algebraic treatment. 5. Regression coefficient is independent of the change of origin but not the scale.
  • 8. USE IN BUSINESS Regression is widely used in the field of business. Businessmen are interested in predicting future production, consumption, investments, price, profit, sales etc. So the success of business depends upon the correctness of the various estimates that they are required to make. It is also used in sociological study and economical planning to find the projections of population, birth rates, death rates etc.
  • 9. TYPES OF REGRESSION ANALYSIS  Simple and Multiple Regression  Linear and Non-Linear Regression  Partial and Total Regression
  • 10. SIMPLE REGRESSION MULTIPLE REGRESSION In Simple Regression Analysis we study about only two variables at a time in which one variable is dependent and other is independent. eg. the functional relationship between income and expenditure In Multiple Regression Analysis we study about multiple variables, among which one is dependent and others are independent. eg. the study of effect of rain and irrigation on the yield of wheat is an example of multiple regression.
  • 11. LINEAR REGRESSION AND NON-LINEAR REGRESSION
    LINEAR REGRESSION: When one variable changes with another variable in a fixed ratio, it is known as linear regression. This type of relationship is depicted on a graph by a straight line, i.e. a first-degree equation.
    NON-LINEAR REGRESSION: When one variable changes with another variable in a changing ratio, it is referred to as curvilinear or non-linear regression. This type of relationship takes the form of a curve on graph paper and is represented by, for example, a second- or third-degree equation.
  • 12. PARTIAL REGRESSION AND TOTAL REGRESSION
    PARTIAL REGRESSION: When more than two variables are involved but the functional relationship between only two of them is studied at a time, with the other variables held constant, it is called Partial Regression.
    TOTAL REGRESSION: When all the variables are studied together at the same time, it is called Total Regression.
  • 13. SCATTER DIAGRAM AND RELATIONS • Determines whether a relationship between two variables exists or not • Shows the basic nature of the relationship between two variables: independent (X) and dependent (Y) • Helps us to draw a regression line (a line fitted between the scatter points to derive a relation between the two variables) • An important note: the regression line is drawn so that as many points as possible lie on it and an equal number of points fall on either side of it.
  • 14. [Figure: three scatter diagrams, titled Graph 1, Graph 2 and Graph 3.]
  • 15. [Figure: scatter diagrams labelled Inverse Linear (with more scattering), Direct Curvilinear, and Inverse Curvilinear.]
  • 16. Estimation Using Regression Line
    [Figure: "Mrs. Hudson’s Shopping", money spent (in K) against no. of stores visited, with plotted points (0, 3), (1, 5), (2, 7).]
    Estimating equation for a straight line: Y = a + bX, where Y = dependent variable, a = Y-intercept (constant), b = slope of the line (constant), X = independent variable.
    Mrs. Hudson wants to determine how much money she will end up spending if she visits 5 stores. (Let b be 2.)
    From the graph: a = 3, b = 2, X = 5.
    Therefore: Y = a + bX = 3 + 2(5) = 13.
    So Mrs. Hudson can expect to spend about 13K when she visits 5 stores.
  • 17. Finding the Slope of Straight Line
    [Figure: "Mrs. Hudson’s Shopping", money spent (in K) against no. of stores visited, with plotted points (0, 3), (1, 5), (2, 7).]
    Slope of a straight line: $b = \frac{Y_2 - Y_1}{X_2 - X_1}$
    Let us determine the slope for Mrs. Hudson’s Shopping graph, using the points (1, 5) and (2, 7):
    $b = \frac{7 - 5}{2 - 1} = \frac{2}{1} = 2$
    So 2 is the slope of the line.
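The two slides above (estimating from a line, and finding its slope) can be sketched in a few lines of Python. This is only an illustration; the function names `slope_between` and `predict` are assumptions introduced here, and the points are the ones plotted on Mrs. Hudson's graph.

```python
def slope_between(p1, p2):
    """Slope b = (Y2 - Y1) / (X2 - X1) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

def predict(a, b, x):
    """Estimating equation for a straight line: Y = a + bX."""
    return a + b * x

# Points read off the graph: (stores visited, money spent in K).
points = [(0, 3), (1, 5), (2, 7)]

b = slope_between(points[1], points[2])  # (7 - 5) / (2 - 1)
a = points[0][1]                         # Y-intercept: the value of Y at X = 0
estimate = predict(a, b, 5)              # spending for 5 stores

print(f"a = {a}, b = {b}, estimated spending for 5 stores = {estimate}K")
```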
  • 18. METHOD OF LEAST SQUARES • How can we “fit” a line mathematically if none of the points lie on the line? • We shall see how to determine the equation for a line drawn through a set of points. • The Estimating Line Equation: Ŷ = a + bX • Ŷ symbolizes the estimated points, i.e. the points that lie on the regression line. • Let’s take an example.
  • 19. DEMONSTRATION OF USE OF LEAST SQUARE METHOD
    [Figure: two candidate estimating lines fitted to the same three actual points (Y = 8, 1, 6). Panel (A): estimated points Ŷ = 6, 5, 4, with errors 2, -4, 2. Panel (B): estimated points Ŷ = 2, 5, 8, with errors 6, -4, -2.]
  • 20. Using Absolute Values Vector Values Summing the Errors of the Estimating Line │Y – Ŷ│ Y – Ŷ │8 – 6│ = 2 8 – 6 = 2 │1 – 5│ = 4 1 – 5 = -4 │6 – 4│ = 2 6 – 4 = 2 Total = 8 (Absl. Error) = 0 (Total Error)
  • 21. Using Absolute Values Vector Values Summing the Errors of the Estimating Line │Y – Ŷ│ Y – Ŷ │8 – 2│ = 6 8 – 2 = 6 │1 – 5│ = 4 1 – 5 = -4 │6 – 8│ = 2 6 – 8 = -2 Total = 12 (Absl. Error) = 0 (Total Error) 2 5 8 Error= 6 Error= -4 Error= -2 0 1 2 3 4 5 6 7 8 9 0 2 4 6 8 10 12 14 (B) Ŷ Y-Values 8 1 6
  • 22. (D) IS A BETTER FIT, BUT…
    [Figure: two estimating lines fitted to the same actual points (Y = 4, 7, 2). Panel (C): estimated points Ŷ = 4, 3, 2. Panel (D): estimated points Ŷ = 5, 4, 3.]
    The Absolute Error │Y – Ŷ│ for (C): │4 – 4│ = 0, │7 – 3│ = 4, │2 – 2│ = 0; Total = 4
    The Absolute Error │Y – Ŷ│ for (D): │4 – 5│ = 1, │7 – 4│ = 3, │2 – 3│ = 1; Total = 5
    The total absolute error is higher for (D) even though (D) looks like the better fit, so absolute error alone is not a reliable way to find the most accurate regression line, which is why we shall use….
  • 23. THE METHOD OF LEAST SQUARES
    [Figure: panels (C) and (D) as on slide 22, with actual points Y = 4, 7, 2.]
    The Squared Error (Y – Ŷ)² for (C): (4 – 4)² = 0, (7 – 3)² = (4)² = 16, (2 – 2)² = 0; Total = 16
    The Squared Error (Y – Ŷ)² for (D): (4 – 5)² = 1, (7 – 4)² = (3)² = 9, (2 – 3)² = 1; Total = 11
    (D) now has the smaller total, which agrees with the visual impression that it is the better fit.
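The comparison between (C) and (D) can be sketched as follows. It is an illustration only, with hypothetical function names, showing how the two criteria rank the same pair of lines differently.

```python
def sum_absolute_errors(actual, estimated):
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, estimated))

def sum_squared_errors(actual, estimated):
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, estimated))

actual = [4, 7, 2]   # actual points, shared by both panels
line_c = [4, 3, 2]   # estimated points in panel (C)
line_d = [5, 4, 3]   # estimated points in panel (D)

for name, estimated in [("C", line_c), ("D", line_d)]:
    print(name,
          "absolute:", sum_absolute_errors(actual, estimated),
          "squared:", sum_squared_errors(actual, estimated))
```

Absolute error prefers (C) (4 against 5), while squared error prefers (D) (11 against 16), because squaring weights the single large miss in (C) much more heavily than three small ones.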
  • 24. THE METHOD OF LEAST SQUARES What does it do? • It magnifies, or penalizes, the larger errors. • Squaring makes every error non-negative, so positive and negative errors no longer cancel each other out (no need for the modulus operator). • Because we look for the estimating line that minimizes the sum of the squares of the errors, we call this the least squares method. • Now let us take a look at how to calculate the constants for a line estimated in this way.
  • 25. Slope of the best-fitting regression line: $b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$
    where $\bar{X}$ = mean of the values of the independent variable, $\bar{Y}$ = mean of the values of the dependent variable, n = the number of data points, X = values of the independent variable, Y = values of the dependent variable.
  • 26. Y-intercept of the best-fitting regression line: $a = \bar{Y} - b\bar{X}$
    where $\bar{X}$ = mean of the values of the independent variable, $\bar{Y}$ = mean of the values of the dependent variable, b = slope from the previous equation.
    E.g. taking the data below:
    Truck No. | Age of Truck (X) | Repair Expense (Y) | XY
    101 | 5 | 7 | 35
    102 | 3 | 7 | 21
    103 | 3 | 6 | 18
    104 | 1 | 4 | 4
    ∑XY = 78, $\bar{X}$ = 3, $\bar{Y}$ = 6, n = 4 (data points)
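As a hedged illustration, the slope and intercept formulas can be applied to the four-truck table step by step. The slide stops at ∑XY and the means; the value of ∑X² and the resulting a and b below are computed here from the listed data rather than quoted from the presentation.

```python
# Four-truck data from the table: age (X) and repair expense (Y).
ages = [5, 3, 3, 1]
expenses = [7, 7, 6, 4]

n = len(ages)
x_bar = sum(ages) / n                                # mean of X
y_bar = sum(expenses) / n                            # mean of Y
sum_xy = sum(x * y for x, y in zip(ages, expenses))  # sum of XY
sum_x2 = sum(x * x for x in ages)                    # sum of X^2 (not shown on the slide)

# b = (sum XY - n * X-bar * Y-bar) / (sum X^2 - n * X-bar^2)
b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
# a = Y-bar - b * X-bar
a = y_bar - b * x_bar

print(f"X-bar = {x_bar}, Y-bar = {y_bar}, sum XY = {sum_xy}, sum X^2 = {sum_x2}")
print(f"b = {b}, a = {a}")
```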
  • 27. A SOLVED EXAMPLE USING THE LEAST SQUARE METHOD AND DEVIATION FROM THE ARITHMETIC MEAN METHOD
  • 28. The Director of the Beverly Hills Sanitation Department is interested in the relationship between the age of a garbage truck and the repair expenses she should expect to incur. Analysts gather data on 5 trucks and present it as:
    Truck No. | Age of Truck (X) | Repair Expenses (Y) (in $100s)
    101 | 3 | 6
    102 | 2 | 1
    103 | 7 | 8
    104 | 4 | 5
    105 | 8 | 9
  • 29. ARRANGE THE DATA
    Trucks (n = 5) | Age (X) | Repair Expense (Y) | XY | X² | Y²
    101 | 3 | 6 | 18 | 9 | 36
    102 | 2 | 1 | 2 | 4 | 1
    103 | 7 | 8 | 56 | 49 | 64
    104 | 4 | 5 | 20 | 16 | 25
    105 | 8 | 9 | 72 | 64 | 81
    Totals | ∑X = 24 | ∑Y = 29 | ∑XY = 168 | ∑X² = 142 | ∑Y² = 207
  • 30. FIRST METHOD (1)
    • Find the numerical constant b (slope of the line): $b = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$
    • Substitute values ($\bar{X} = 4.8$, $\bar{Y} = 5.8$, n = 5):
      $b = \frac{168 - 5(4.8)(5.8)}{142 - 5(4.8)^2} = \frac{168 - 139.2}{142 - 115.2} = \frac{28.8}{26.8} \approx 1.07$
  • 31. • Find the numerical constant a (Y-intercept of the line): $a = \bar{Y} - b\bar{X}$
    • Substitute values ($\bar{X} = 4.8$, $\bar{Y} = 5.8$, b = 1.07):
      a = 5.8 – 1.07(4.8) = 5.8 – 5.136 ≈ 0.66
  • 32. Regression Line equation (substitute the values): $\hat{Y} = a + bX$, so $\hat{Y} = 0.66 + 1.07X$
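One way to sanity-check the solved example is to refit the five-truck data at full precision and compare the sum of squared errors with that of the rounded line from the slide. This is a sketch under stated assumptions: `fit_line` and `sum_squared_errors` are hypothetical helper names, not part of the presentation.

```python
# Five-truck data from the solved example: age (X) and repair expense (Y, in $100s).
ages = [3, 2, 7, 4, 8]
expenses = [6, 1, 8, 5, 9]

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y-hat = a + b*X."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    a = y_bar - b * x_bar
    return a, b

def sum_squared_errors(a, b, xs, ys):
    """Sum of (Y - Y-hat)^2 for the line Y-hat = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

a_full, b_full = fit_line(ages, expenses)
sse_full = sum_squared_errors(a_full, b_full, ages, expenses)
sse_slide = sum_squared_errors(0.66, 1.07, ages, expenses)  # rounded line from the slide

print(f"full precision: a = {a_full:.4f}, b = {b_full:.4f}, SSE = {sse_full:.4f}")
print(f"slide values:   a = 0.66, b = 1.07, SSE = {sse_slide:.4f}")
```

At full precision the slope still rounds to 1.07, but the intercept comes out nearer 0.64 than 0.66; the slide's 0.66 results from rounding b to 1.07 before computing a. The difference in squared error between the two lines is very small.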
  • 33. ARRANGE THE DATA (*REPEATED) Same table as slide 29, for the five trucks (n = 5): ∑X = 24, ∑Y = 29, ∑XY = 168, ∑X² = 142, ∑Y² = 207
  • 34. SECOND METHOD (2)
    Normal equation to find ‘a’: $\sum Y = na + b\sum X$
    Normal equation to find ‘b’: $\sum XY = a\sum X + b\sum X^2$
    Substituting values from the table, we get:
    29 = 5a + 24b ……………(i)
    168 = 24a + 142b, which on dividing by 2 gives 84 = 12a + 71b ………….(ii)
    Multiply equation (i) by 12 and equation (ii) by 5.
  • 35. (when multiplied by 12 and 5 respectively)
    348 = 60a + 288b ……………(iii)
    420 = 60a + 355b ……………(iv)
    Subtracting (iii) from (iv) gives 72 = 67b, so b ≈ 1.07. Substituting b = 1.07 into (i) gives 5a = 29 – 25.68, so a ≈ 0.66.
    The Estimating Equation becomes: $\hat{Y} = 0.66 + 1.07X$
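The elimination above can also be carried out mechanically. The sketch below solves the two normal equations with Cramer's rule, which is a technique swapped in here for brevity rather than the elimination used on the slides; the function name `solve_normal_equations` is an assumption.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2):
    """Solve  sum_y  = n*a     + b*sum_x
              sum_xy = a*sum_x + b*sum_x2   for (a, b) by Cramer's rule."""
    det = n * sum_x2 - sum_x * sum_x
    if det == 0:
        raise ValueError("all X values are identical; the line is not determined")
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b

# Column totals from the five-truck table.
a, b = solve_normal_equations(n=5, sum_x=24, sum_y=29, sum_xy=168, sum_x2=142)
print(f"a = {a:.4f}, b = {b:.4f}")
```

The slope agrees with the first method (b ≈ 1.07); the intercept is shown without intermediate rounding.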
  • 37. EXTRA: CORRELATION ANALYSIS After discovering the type of relationship between two variables, what may be the next step? Correlation analysis tells us to what degree the two variables are related. It helps us to judge how reliable the relationship is, through measures such as the coefficient of determination and the coefficient of correlation. Since this topic is outside the scope of this presentation, we do not discuss it here. The next steps, however, are correlation analysis and multiple regression.
  • 38. REGRESSION AND CORRELATION ANALYSIS: LIMITATIONS, ERRORS AND CAVEATS
    • Extrapolation beyond the Range of the Observed Data: An estimating equation is only valid over the same range as the one from which the sample was taken initially.
    • Cause and Effect: Regression and Correlation analysis do not state that a change in one variable causes a change in another variable.
    • Using Past Trends to Estimate Future Trends: Historical data must be reappraised if we use it to determine estimating equations for the present.
    • Finding Relationships when they do not Exist: It takes knowledge and common sense to deduce which relationships are meaningful and which are meaningless.
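The extrapolation caveat can be made concrete with a small guard around the estimating equation. This is a minimal sketch, assuming the rounded line from the solved example and the observed truck ages; the function name `estimate_repair_expense` and the choice to raise an error (rather than, say, warn) are illustrative assumptions.

```python
# Rounded estimating equation from the solved example: Y-hat = 0.66 + 1.07 X.
A, B = 0.66, 1.07
# Ages of the five trucks the line was fitted on.
OBSERVED_AGES = [3, 2, 7, 4, 8]

def estimate_repair_expense(age):
    """Estimate repair expense (in $100s), refusing to extrapolate
    beyond the range of ages present in the sample."""
    low, high = min(OBSERVED_AGES), max(OBSERVED_AGES)
    if not low <= age <= high:
        raise ValueError(
            f"age {age} is outside the observed range {low} to {high}; "
            "the estimating equation is not valid there"
        )
    return A + B * age

print(estimate_repair_expense(5))   # inside the observed range
try:
    estimate_repair_expense(20)     # far outside the observed range
except ValueError as err:
    print("refused:", err)
```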
  • 39. CONCLUSION Thus we see why Regression Analysis is so widely used in all kinds of fields, and not just business. Both Regression and Correlation Analysis are powerful tools for predicting outcomes and forecasting data from previously collected samples. Regression Analysis has its limitations, however, and should be used with discretion.
  • 40. END