Statistical Analysis
Introduction
Statistical analysis is a component of data analytics. It involves collecting,
summarizing and interpreting data samples. A sample, in statistics, is
a representative selection drawn from a total population.
Objectives
The objectives of the analysis are:
• To check whether area and consumption of fertilizer have a significant
impact on the production of crops.
• If so, to estimate how much production increases with an increase in area and
consumption of fertilizer.
Software
The SPSS software is used for statistical analysis.
Methodology
A multiple regression technique was applied to the data on production
of crops, area of crops and consumption of fertilizer. Multiple linear
regression attempts to model the relationship between two or more
explanatory variables and a response variable. Every observation of the
independent variables x is associated with a value of the dependent
variable y. The technique rests on the following assumptions:
• Assumption of linearity.
• Assumption of normality.
• Assumption of no multicollinearity.
• Assumption of homoscedasticity.
• Assumption of no autocorrelation.
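The model described above can be written as Y = b0 + b1*X1 + b2*X2 + e. The slides obtained their estimates from SPSS. As a minimal sketch of what such a fit does, the code below solves the ordinary least squares normal equations for two predictors in plain Python. The function names and the small data set are hypothetical and only illustrate the mechanics; they are not the study's data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on a and b together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(x1, x2, y):
    """Return [b0, b1, b2] for y = b0 + b1*x1 + b2*x2 by least squares."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]  # design matrix with intercept
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free data generated from y = 1 + 2*x1 + 3*x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]
b0, b1, b2 = fit_ols(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the illustrative data contain no noise, the fit recovers the generating coefficients. With real data the residual e is non-zero and the assumptions listed above need to be checked.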
Assumption of linearity
The assumption of linearity states that the multiple regression model is linear in the
parameters. It is further assumed that the values of the regressors are fixed in repeated
sampling and that there is sufficient variability in the values of the regressors.
Interpretation
Both scatter plots of the residuals against the independent variables show a random
pattern, so we conclude that the relationship between production and the predictors (area and
consumption of fertilizer) is linear.
Assumption of normality
• The assumption of normality says that the stochastic (disturbance) term e_i is normally distributed.
Several methods are used to check whether the residuals are normal.
• Normal Q-Q plots are made to verify the normality of the residuals. The Kolmogorov-Smirnov and
Shapiro-Wilk tests are also applied.
Interpretation
In both Q-Q plots the data points lie close to the diagonal line, which indicates that
the residuals are normally distributed. The table also shows that the tests on both the standardized and
unstandardized residuals are not significant, that is, the p-values (0.200 and 0.876) are greater than 0.05,
so we do not reject the hypothesis that the residuals are normally distributed.
                          Kolmogorov-Smirnov         Shapiro-Wilk
                          Statistic   df   Sig.      Statistic   df   Sig.
Unstandardized Residual   .133        15   .200      .971        15   .876
Standardized Residual     .133        15   .200      .971        15   .876
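The Q-Q plots referred to above pair each ordered residual with the quantile a standard normal distribution would give at the same rank. A minimal sketch of that pairing follows, using only the Python standard library. The residual values are hypothetical (the slides do not list the residuals), and the plotting position (i - 0.5) / n is one common convention assumed here; SPSS may use a different one.

```python
from statistics import NormalDist


def qq_points(residuals):
    """Pair each sorted residual with its theoretical standard-normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting position (i - 0.5) / n for the i-th smallest value (assumed convention).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical residuals, for illustration only (not the study's data).
example_residuals = [-1.2, 0.4, 0.1, -0.3, 0.9, -0.6, 1.5, 0.0, -0.1]
points = qq_points(example_residuals)
for theoretical_q, observed in points:
    print(f"{theoretical_q:7.3f}  {observed:5.2f}")
```

If the residuals are roughly normal, the printed pairs fall close to a straight line when plotted against each other.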
Assumption of multi-collinearity
• Multicollinearity in regression occurs when predictor variables
(independent variables) in the regression model are highly
correlated with one another, in some definitions more highly than with the
dependent variable. A good regression model should not have strong correlation between
the independent variables, that is, it should not have multicollinearity.
• Multicollinearity can be assessed by examining tolerance and the
Variance Inflation Factor (VIF).
Interpretation
From the table we can observe that the VIF lies between 1 and 10, that is, 1 < 1.754 < 10. So we
conclude that there is no serious multicollinearity in our data.
Model        Collinearity Statistics
             Tolerance   VIF
Area         .570        1.754
Fertilizer   .570        1.754
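The two columns of the table are linked: VIF is the reciprocal of tolerance. With exactly two predictors, tolerance equals 1 - r², where r is the correlation between the predictors. The short sketch below reproduces the tabulated VIF from the tabulated tolerance; the function name is illustrative.

```python
def vif_from_tolerance(tolerance):
    """Variance Inflation Factor is the reciprocal of tolerance."""
    return 1.0 / tolerance


tolerance = 0.570  # value from the collinearity table
vif = vif_from_tolerance(tolerance)

# With only two predictors, tolerance = 1 - r^2, so |r| = sqrt(1 - tolerance).
abs_r_between_predictors = (1.0 - tolerance) ** 0.5

print(round(vif, 3))                       # 1.754
print(round(abs_r_between_predictors, 3))  # 0.656
```

The implied correlation of about 0.66 in magnitude between area and fertilizer is moderate, which is consistent with a VIF well below 10.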
Assumption of homoscedasticity
• One of the important assumptions of the linear regression model is that the variance of
each disturbance term e_i, conditional on the chosen values of the explanatory variables,
is some constant number equal to σ². This is the assumption of homoscedasticity, or
equal (homo) spread (scedasticity), that is, equal variance.
Symbolically,
$E(e_i^2) = \sigma^2, \quad i = 1, 2, \ldots, n$
• I have used the Glejser test and a scatter plot to test whether the data are homoscedastic.
Interpretation
From the table we can observe that the significance values of the Glejser test for area and consumption of fertilizer are
0.806 and 0.933 respectively. Both are greater than α = 0.05, which indicates that there is no problem of heteroscedasticity.
The scatter plot also shows a random pattern, which likewise indicates that the variances are equal and there is no
problem of heteroscedasticity.
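The Glejser test regresses the absolute residuals on an explanatory variable and checks whether the slope is significant. A simplified sketch with a single predictor follows. It is an assumption that this mirrors the SPSS procedure used in the slides, which reports a significance value for each of the two predictors. The data are hypothetical, and the function returns only the t statistic, which would still need to be compared with a t critical value on n - 2 degrees of freedom.

```python
import math


def glejser_t_statistic(residuals, predictor):
    """t statistic for the slope when |residual| is regressed on one predictor."""
    n = len(predictor)
    abs_resid = [abs(e) for e in residuals]
    mean_x = sum(predictor) / n
    mean_y = sum(abs_resid) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, abs_resid))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares of the auxiliary regression.
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(predictor, abs_resid))
    slope_se = math.sqrt(sse / (n - 2) / sxx)
    return slope / slope_se


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
spread_grows = [1.0, -1.0, 2.0, -2.0, 3.0, -3.0, 4.0, -4.0]     # |e| rises with x
spread_constant = [1.0, -2.0, 1.0, -2.0, 1.0, -2.0, 1.0, -2.0]  # |e| unrelated to x

t_grows = glejser_t_statistic(spread_grows, x)
t_constant = glejser_t_statistic(spread_constant, x)
print(round(t_grows, 2), round(t_constant, 2))
```

A large t statistic, as in the first hypothetical series, points to heteroscedasticity. A small one, as in the second, corresponds to the large significance values reported in the slides.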
Assumption of auto-correlation
• The term autocorrelation may be defined as “correlation between members of series of
observations ordered in time [as in time series data] or space [as in cross-sectional data].” In
the regression context, the linear regression model assumes that such autocorrelation does
not exist in the disturbances e_i.
• The problem of autocorrelation can be detected by using the runs test or the Durbin-Watson
test.
• I have used the Durbin-Watson d test to detect whether there is a problem of autocorrelation.
Interpretation
In this analysis I used the hypotheses H0: ρ = 0 versus H1: ρ ≠ 0. Reject H0 at the 2α level if d < dU or (4 − d) <
dU, that is, if there is statistically significant evidence of autocorrelation, positive or negative.
The tabulated dL and dU for n = 15 and two regressors are 0.700 and 1.252 respectively; these are the α = 0.01 critical values, so 2α = 0.02.
From the table we can see that d = 2.732 is greater than dU = 1.252, and 4 − d = 1.268 is also greater than dU, so by the
above decision rule we conclude that there is no autocorrelation.
Model Durbin-Watson
1 2.732
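The decision rule above is easy to check mechanically. The sketch below computes the Durbin-Watson statistic from a residual series and applies the two-sided rule quoted in the interpretation to the reported d = 2.732 and dU = 1.252. The function names are illustrative, and the short residual series is hypothetical because the slides do not list the residuals.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def reject_no_autocorrelation(d, d_upper):
    """Two-sided rule from the slides: reject H0 if d < dU or (4 - d) < dU."""
    return d < d_upper or (4.0 - d) < d_upper


# Values reported in the slides.
reject = reject_no_autocorrelation(2.732, 1.252)
print(reject)  # False: no evidence of autocorrelation under this rule

# Hypothetical alternating residuals show how negative autocorrelation pushes d above 2.
d_example = durbin_watson([1.0, -1.0, 1.0, -1.0])
print(d_example)  # 3.0
```

A value of d near 2 suggests no first-order autocorrelation. Values toward 0 suggest positive autocorrelation and values toward 4 suggest negative autocorrelation.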
Model fitting
Model            Unstandardized Coefficients   Standardized Coefficients   t        Sig.
                 B             Std. Error      Beta
1  (Constant)    -32782.283    7282.849                                    -4.501   .001
   fertilizer    4.257         .606            .595                        7.025    .000
   area          3.724         .663            .476                        5.619    .000
Fitted Model
Y = -32782.283 + 3.724*X1 + 4.257*X2
Production = -32782.283 + 3.724*area + 4.257*consumption of fertilizer
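As a small illustration of how the fitted equation is used, the sketch below evaluates it for two inputs that differ by one unit of area. The coefficients are those of the fitted model. The input values (10000 and 2000) are hypothetical and chosen only to show the arithmetic; the function name is illustrative.

```python
def predict_production(area, fertilizer):
    """Fitted model from the slides; inputs and output in the slides' units."""
    return -32782.283 + 3.724 * area + 4.257 * fertilizer


# Hypothetical inputs, not taken from the study's data.
baseline = predict_production(10000.0, 2000.0)
one_more_unit_of_area = predict_production(10001.0, 2000.0)

print(round(baseline, 3))                          # 12971.717
print(round(one_more_unit_of_area - baseline, 3))  # 3.724
```

Holding fertilizer fixed, the prediction changes by exactly the area coefficient for each additional unit of area.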
Interpretation
The p-values for both β1 and β2 are reported as .000 (p < 0.001), which is much less than 0.05. Such low p-values indicate that we can
reject the null hypothesis that each coefficient is zero. In other words, a predictor that has a low p-value is likely to be a meaningful
addition to our model because changes in the predictor's value are related to changes in the response variable.
In the fitted model β1 = 3.724, which indicates that production increases by an average of 3.724 (000 tons) for
every additional unit (000 hectare) of area, keeping fertilizer constant. Also β2 = 4.257 indicates that for
every additional unit (000 nutrient/ton) of consumption of fertilizer, production increases by an average of 4.257 (000
tons), keeping area constant.
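The t values in the coefficient table are each coefficient divided by its standard error. The sketch below recomputes them from the tabulated B and Std. Error columns. Small differences from the reported t values are expected because the table entries are rounded.

```python
# (B, Std. Error, reported t) from the coefficient table.
coefficients = {
    "(Constant)": (-32782.283, 7282.849, -4.501),
    "fertilizer": (4.257, 0.606, 7.025),
    "area": (3.724, 0.663, 5.619),
}

# t statistic for H0: coefficient = 0 is the estimate over its standard error.
t_values = {name: b / se for name, (b, se, _) in coefficients.items()}

for name, (_, _, reported_t) in coefficients.items():
    print(f"{name:12s} computed t = {t_values[name]:7.3f}   reported t = {reported_t:7.3f}")
```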
R-Squared
• R-squared is a statistical measure of how close the data are to the fitted regression line. It is also
known as the coefficient of determination, or the coefficient of multiple determination for multiple
regression.
• The definition of R-squared is fairly straightforward: it is the percentage of the response variable
variation that is explained by a linear model.
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .975   .951       .943                916.5336
Interpretation
The tabulated value of R² is 0.951, that is 95.1%, which is very close to 100%. It shows that the model explains
95.1% of the variability of the response variable around its mean, so the model fits our data well.
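The R Square and Adjusted R Square values can be reproduced from other figures in the slides. R² is the regression sum of squares over the total sum of squares (both from the ANOVA table), and adjusted R² corrects for the number of predictors. The sketch below assumes n = 15 observations, as indicated by the degrees of freedom in the tables, and p = 2 predictors.

```python
n = 15           # observations (df in the normality table; total df 14 in ANOVA)
p = 2            # predictors: area and fertilizer
r_squared = 0.951

# R^2 as explained variation over total variation, from the ANOVA sums of squares.
ss_regression = 1.955e8
ss_total = 2.056e8
r_squared_from_ss = ss_regression / ss_total

# Adjusted R^2 penalises the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

print(round(r_squared_from_ss, 3))   # 0.951
print(round(adjusted_r_squared, 3))  # 0.943
```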
ANOVA for Regression
• Analysis of Variance (ANOVA) consists of calculations that provide information about levels of variability
within a regression model and form a basis for tests of significance. It rests on the basic regression decomposition DATA =
FIT + RESIDUAL.
Model           Sum of Squares   df   Mean Square   F         Sig.
1  Regression   1.955E8          2    9.777E7       116.391   .000
   Residual     1.008E7          12   840033.786
   Total        2.056E8          14
Interpretation
The significance value of the regression in the table is reported as .000 (p < 0.001), which is less than 0.05, indicating that the fitted model is
statistically significant.
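The entries of the ANOVA table follow from one another: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and the square root of the residual mean square is the standard error of the estimate. The sketch below recomputes them from the tabulated sums of squares. Because those are printed to only four significant figures, the results match the reported values only approximately.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 1.955e8, 2
ss_residual, df_residual = 1.008e7, 12

ms_regression = ss_regression / df_regression  # reported as 9.777E7
ms_residual = ss_residual / df_residual        # reported as 840033.786
f_statistic = ms_regression / ms_residual      # reported as 116.391
std_error_of_estimate = math.sqrt(ms_residual) # reported as 916.5336

# DATA = FIT + RESIDUAL: the two sums of squares add up to the total.
ss_total = ss_regression + ss_residual         # reported as 2.056E8

print(round(f_statistic, 2), round(std_error_of_estimate, 2))
```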
Conclusion
• The model is significant at the 5% level of significance, with F = 116.391 and a p-value reported as .000 (p < 0.001). This indicates that
the multiple regression of production of crops on area and consumption of fertilizer is significant.
• Both regression coefficients have p-values reported as .000, so both are significant at the 5% level of significance. When
area is increased by one unit (000 hectare), production increases by an average of 3.724 (000 tons), keeping fertilizer
constant. When consumption of fertilizer is increased by one unit (000 nutrient/ton), production increases
by an average of 4.257 (000 tons), keeping area constant.
• The value of R² is 0.951, which means that 95.1% of the variation in production of crops is explained by its linear relationship
with area and consumption of fertilizer. The remaining 4.9% is left unexplained and may be due to other variables that are not included
in the model.
• So we conclude that there is a significant effect of area and consumption of fertilizer on production of crops.
Thank You
