1. Definitions
1. Regression analysis is a statistical process for estimating the
relationships among variables. It includes many techniques for modeling
and analyzing several variables, when the focus is on the relationship
between a dependent variable and one or more independent variables.
2. "Least squares" means that the overall solution minimizes the sum of
the squared residuals, i.e. the squares of the differences between each
observed value and the value the fitted model predicts for it.
3. Significance level: the significance level of a statistical hypothesis
test is a fixed probability of wrongly rejecting the null hypothesis H0
when it is in fact true.
4. Simple linear regression is the least squares estimator of a
linear regression model with a single explanatory variable.
5. Multiple linear regression attempts to model the relationship between
two or more explanatory variables and a response variable by fitting a
linear equation to observed data. Every set of values of the independent
variables x is associated with a value of the dependent variable y.
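The definitions of least squares, simple linear regression, and multiple linear regression can be illustrated with a minimal sketch. It assumes NumPy is available; the function names `fit_simple` and `fit_multiple` and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def fit_simple(x, y):
    """Closed-form least squares fit of y = b0 + b1 * x (one explanatory variable)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope = covariation of x and y divided by the variation of x.
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    b0 = y_mean - b1 * x_mean
    return b0, b1

def fit_multiple(X, y):
    """Least squares fit of y = b0 + b1*x1 + ... + bk*xk; returns [b0, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    # lstsq minimizes the sum of squared residuals ||design @ coef - y||^2.
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Toy data (made up): y = 1 + 2x exactly, so the fit should recover b0 = 1, b1 = 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b0, b1 = fit_simple(x, y)

# Toy data (made up): y = 1 + 2*x1 + 3*x2 exactly.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y2 = [1 + 2 * a + 3 * b for a, b in X]
coef = fit_multiple(X, y2)
```

Because the toy data lie exactly on a line (or plane), the residuals are zero and the fitted coefficients match the generating ones up to floating-point error. With noisy data the same code returns the coefficients that minimize the sum of squared residuals.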
2. Definitions
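• Standard error measures the sampling variability of an estimate. It is used to construct
confidence intervals, e.g. for a regression coefficient or for the predicted value of the
dependent variable.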
• Multicollinearity (also collinearity) is a statistical phenomenon in which two or more
predictor variables in a multiple regression model are highly correlated, meaning that one
can be linearly predicted from the others with a non-trivial degree of accuracy.
• Remedies for multicollinearity:
1. If the multicollinearity is mild: do nothing
2. Drop one of the variables
3. Transform the variables
   1. Combination
   2. Log or square
4. Increase the sample size
• The P value, or calculated probability, is the probability, computed
assuming the null hypothesis (H0) of a study question is true, of
obtaining a result at least as extreme as the one actually observed.
A small P value is evidence against H0.
• Degrees of freedom: a positive whole number that indicates the lack of
restrictions in our calculations, i.e. the number of values in a
calculation that are free to vary.
• R-squared is a statistical measure of how close the data are to the
fitted regression line: the proportion of the variance in the dependent
variable that the model explains. It is also known as the coefficient of
determination, or the coefficient of multiple determination for
multiple regression.
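Several of the quantities defined on this slide can be computed from one fitted model. The sketch below assumes NumPy and simulated data; the helper names `ols_summary` and `vif` are hypothetical. The variance inflation factor (VIF) is not mentioned in the slides; it is used here only as one common way to quantify multicollinearity, with values well above roughly 10 often read as a warning sign.

```python
import numpy as np

def ols_summary(X, y):
    """Fit y on X (plus an intercept) and report basic regression quantities."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape                      # n observations, k predictors
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = np.sum(residuals ** 2)     # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)  # total variation
    df_resid = n - k - 1                # values free to vary after estimating k + 1 coefficients
    sigma2 = ss_res / df_resid          # estimated error variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return {
        "coef": coef,
        "r_squared": 1.0 - ss_res / ss_tot,
        "df_resid": df_resid,
        "std_err": np.sqrt(np.diag(cov)),  # standard error of each coefficient
    }

def vif(X):
    """VIF of each predictor: 1 / (1 - R^2) from regressing it on the other predictors."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        r2 = ols_summary(others, X[:, j])["r_squared"]
        factors.append(1.0 / (1.0 - r2))  # breaks down if a predictor is perfectly collinear
    return np.array(factors)

# Simulated data (an assumption for illustration): x2 is almost a copy of x1.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + 2.0 * x1 - 1.0 * x3 + rng.normal(scale=0.5, size=n)

summary = ols_summary(X, y)
vifs = vif(X)
```

In this simulated setup `x1` and `x2` get large VIFs while `x3` stays close to 1, and the standard errors of the `x1` and `x2` coefficients are inflated accordingly. Dropping one of the two, as the list of remedies suggests, removes the problem.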
3. Definitions
• Heteroskedasticity: in statistics, the condition in which the standard
deviations of a variable, monitored over a specific amount of time, are
non-constant. In regression, it means the spread of the errors differs
across observations. Heteroskedasticity often arises in two forms,
conditional and unconditional.
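A non-constant standard deviation can be made concrete with a small simulation. This is a minimal sketch, assuming NumPy; the data are simulated and the helper `residual_sd_ratio` is a hypothetical, informal check, not a formal statistical test. The variable `x` could stand for time.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 400
x = np.linspace(1.0, 10.0, n)

# Constant error spread (homoskedastic) vs. spread that grows with x (heteroskedastic).
y_homo = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, n)
y_hetero = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, n) * x

def residual_sd_ratio(x, y):
    """Residual standard deviation in the upper half of x divided by that in the lower half."""
    slope, intercept = np.polyfit(x, y, 1)   # straight-line least squares fit
    resid = y - (intercept + slope * x)
    order = np.argsort(x)
    half = len(x) // 2
    low = resid[order[:half]]
    high = resid[order[half:]]
    return high.std(ddof=1) / low.std(ddof=1)

ratio_homo = residual_sd_ratio(x, y_homo)      # expected to be close to 1
ratio_hetero = residual_sd_ratio(x, y_hetero)  # expected to be clearly above 1
```

A ratio near 1 is consistent with a constant error spread, while a ratio well above 1 indicates that the residuals fan out as `x` increases.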