Regression Analysis
Dr. Jai Singh, Faculty
Vasanta College, Banaras Hindu University, Varanasi
What is Regression Analysis?
• Regression analysis is a set of statistical
methods used for the estimation of
relationships between a dependent variable
and one or more independent variables.
• It can be utilized to assess the strength of the
relationship between variables and for
modeling the future relationship between
them.
What is Regression Analysis? (continued)
• Regression analysis is a way of mathematically
sorting out which variables do indeed have an
impact on the outcome.
• It answers the questions:
- Which factors matter most?
- Which can we ignore?
- How do those factors interact with each other?
- How certain are we about all of these factors?
Regression analysis as a statistical model
• Regression analysis is a set of statistical
processes for estimating the relationships
between a dependent variable (outcome or
criterion variable) and one or more explanatory
variables (predictors, covariates, or features).
Purpose of Regression analysis
• Regression analysis is widely used for prediction
and forecasting.
• In some situations, regression analysis can be used
to infer causal relationships between the independent
and dependent variables.
• By themselves, however, regressions only reveal
relationships between a dependent variable and a
collection of independent variables in a fixed
dataset; a causal interpretation needs justification
beyond the fitted model.
Regression Analysis: Assumptions
The major assumptions of linear regression are:
• Linearity: The relationship between X and the
mean of Y is linear.
• Homoscedasticity: The variance of the residuals is
the same for any value of X.
• Independence: Observations are independent of
each other.
• Normality: For any fixed value of X, Y is normally
distributed.
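Some of the assumptions above can be examined informally from the residuals of a fitted line. The sketch below is only an illustration: the data are made up, and the two checks (comparing residual spread in the lower and upper halves of X, and the Durbin-Watson statistic) are rough screening devices chosen here, not methods taken from the slides.

```python
# Informal residual checks for a fitted line Y = a + bX.
# The data are made up for illustration only.

def fit_line(x, y):
    """Return (a, b), the least-squares intercept and slope."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.3, 3.8, 6.1, 8.2, 9.7, 12.4, 13.8, 16.1, 18.3, 19.9]

a, b = fit_line(x, y)
pairs = sorted(zip(x, y))  # order the observations by X
residuals = [yi - (a + b * xi) for xi, yi in pairs]

# Homoscedasticity (rough check): the mean squared residual should be
# of similar size for small and large X, so the ratio should be near 1.
half = len(residuals) // 2
low_spread = sum(e * e for e in residuals[:half]) / half
high_spread = sum(e * e for e in residuals[half:]) / (len(residuals) - half)
variance_ratio = high_spread / low_spread

# Independence (rough check): the Durbin-Watson statistic lies between
# 0 and 4; values near 2 suggest no first-order autocorrelation.
# It is meaningful only when the observations have a natural order.
durbin_watson = sum(
    (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
) / sum(e * e for e in residuals)

print(f"variance ratio = {variance_ratio:.2f}")
print(f"Durbin-Watson  = {durbin_watson:.2f}")
```

In practice a plot of residuals against X (or against fitted values) is usually examined alongside such numbers, since a single statistic can hide a clear pattern.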
Regression Analysis – Simple linear regression
Linear regression attempts to model the relationship between two variables by
fitting a linear equation to observed data. One variable is considered to be
an explanatory variable, and the other is considered to be a dependent variable.
The simple linear model is expressed using the following equation:
Y = a + bX + ϵ
Where:
Y – Dependent (criterion) variable
X – Independent (explanatory) variable
a – Intercept
b – Slope
ϵ – Residual (error)
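As a minimal sketch of how a and b in the equation above can be estimated, the following uses the ordinary least squares formulas b = Sxy / Sxx and a = mean(Y) - b * mean(X). The function name and the data are hypothetical and serve only as an illustration.

```python
# Ordinary least squares for Y = a + bX + e, standard library only.
# The data below are made up for illustration.

def fit_simple_regression(x, y):
    """Return (a, b): the intercept and slope that minimise the
    sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept
    return a, b

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_simple_regression(x, y)

# Residuals are the observed Y minus the fitted value a + bX.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
```

For these numbers the fitted line is approximately Y = 0.05 + 1.99X, and the residuals sum to zero, as they always do when the model includes an intercept.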
Regression Analysis – Multiple linear regression
Multiple linear regression analysis is essentially similar to
the simple linear model, with the exception that multiple
independent variables are used in the model. The
mathematical representation of multiple linear regression
is:
Y = a + bX1 + cX2 + dX3 + ϵ
Where:
Y – Dependent variable
X1, X2, X3 – Independent (explanatory) variables
a – Intercept
b, c, d – Slopes
ϵ – Residual (error)
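The coefficients a, b, c, d can be estimated by least squares in the same way as in the simple model. The sketch below solves the normal equations (X'X)β = X'y with a small Gaussian elimination routine so that it needs no external library. The helper names are hypothetical, and the data are synthetic and noise-free, generated from Y = 1 + 2X1 + 3X2 - X3, so the fit should recover those coefficients.

```python
# Least squares for Y = a + b*X1 + c*X2 + d*X3 via the normal equations.
# Helper names and data are illustrative assumptions.

def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_multiple_regression(X, y):
    """X is a list of rows [X1, X2, ...]; returns [a, b, c, ...]."""
    design = [[1.0] + list(row) for row in X]  # column of 1s for the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

X = [[1, 0, 2], [2, 1, 0], [3, 1, 1], [4, 3, 2], [5, 2, 4], [6, 5, 1]]
y = [1 + 2 * x1 + 3 * x2 - x3 for x1, x2, x3 in X]  # synthetic, no error term

coefficients = fit_multiple_regression(X, y)
print([round(c, 6) for c in coefficients])
```

With real data the error term is not zero, and the normal equations can be numerically unstable when predictors are strongly correlated; statistical packages generally use more robust decompositions for that reason.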
Image retrieved from:
https://www.ablebits.com/office-addins-blog/2018/08/01/linear-regression-analysis-excel/
Situations for using Regression
Situation: An investigator wants to find out how much of the
variation in teaching behavior is contributed by Emotional
Intelligence, Teaching Experience, and Training.
Here:
Criterion Variable: Teaching Behavior
Explanatory Variables: Emotional Intelligence,
Teaching Experience, and Training
Situations for using Regression
• Situation: A research agency conducted a study to find out why
the learning outcomes of students are showing a declining trend in
the National Achievement Survey (NAS). For this, the agency identified
factors such as attitude of teachers, learning environment,
SES of parents, education level of parents, and activity-based
teaching-learning process.
Criterion Variable: Learning Outcomes
Explanatory Variables: Attitude of teachers, learning environment,
SES of parents, education level of parents, activity-based
teaching-learning process
Situations for using Regression
• Situation: Identifying factors that contribute to the death rate
among COVID-19 patients, such as co-morbid conditions (heart
disease, diabetes, etc.), age, and testing facilities.
Criterion Variable: Death rate due to COVID-19
Explanatory Variables: Co-morbid conditions (heart disease,
diabetes, etc.), age, testing facilities, etc.
How to formulate hypotheses for
regression analysis
• Null Hypothesis: Co-morbidity, age, early testing, etc.
will not significantly explain the variance in the
death rate due to COVID-19.
• Alternative Hypothesis: Co-morbidity, age, early
testing, etc. will significantly explain the variance in
the death rate due to COVID-19.
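One common way to test hypotheses of this form is the overall F statistic, which compares the variance explained by the model with the residual variance. The slides do not name a specific test, so the choice of the F statistic is an assumption made for this sketch; the function name and the numbers are also hypothetical and unrelated to any COVID-19 data.

```python
# R-squared and the overall F statistic for a fitted regression.
# k is the number of explanatory variables; the data are made up.

def r_squared_and_f(y, fitted, k):
    """Return (R^2, F) given observed values, fitted values, and k."""
    n = len(y)
    mean_y = sum(y) / n
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    r2 = 1 - ss_resid / ss_total          # share of variance explained
    f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
    return r2, f_stat

# Illustrative single-predictor example (k = 1).
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x
fitted = [a + b * xi for xi in x]

r2, f_stat = r_squared_and_f(y, fitted, k=1)
print(f"R^2 = {r2:.4f}, F = {f_stat:.1f} on (1, {n - 2}) degrees of freedom")
```

The null hypothesis is rejected when F exceeds the critical value of the F distribution with (k, n - k - 1) degrees of freedom at the chosen significance level; that critical value comes from a table or a statistics package and is not computed here.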

Workshop QCI- regression_analysis
