2.2 Logit and Probit.pptx
1. What is Regression Analysis?
• Regression analysis is the technique of estimating the unknown value of a
dependent variable from the known value of an independent variable.
E.g.: the effect of a price increase on demand, or
the effect of changes in the money supply on the
inflation rate.
2. Regression Lines
A regression line is a line that best describes the
linear relationship between the two variables.
y = a + bx
[Figure: regression lines Y = a + bX and Y = a - bX, each with intercept a, plotted against the X axis and Y axis.]
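As a minimal sketch of how the line y = a + bx can be estimated, the snippet below uses the ordinary least squares formulas for one independent variable. The function name `fit_line` and the price/demand numbers are hypothetical and serve only as an illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: price (x) and quantity demanded (y).
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
demand = [9.0, 7.0, 5.0, 3.0, 1.0]
a, b = fit_line(prices, demand)
print(f"y = {a:.2f} + ({b:.2f})x")
```

A negative b corresponds to the downward-sloping case Y = a - bX, a positive b to the upward-sloping case Y = a + bX.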
3. Assumptions for regression
➢ Measurement:
• All independent variables: interval/ratio/dichotomous
• Dependent variable: interval/ratio
➢ Specification:
• Linear relationship between the dependent and
independent variables
➢ Expected value of error term: zero
4. Assumptions for regression (contd.)
➢ Homoscedasticity
• Variance of the error term is the same/constant
➢ Normality of error
• Normally distributed for each set of values of
the independent variable
➢ Absence of multicollinearity
5. Limitations of linear regression
Violation of the measurement assumptions
• Dependent variable: if it is dichotomous
E.g.: smoker and non-smoker
adopter and non-adopter
participating and non-participating
• Independent variable: if any of the independent variables (IV) is dichotomous
E.g.: male and female
6. Shall we use the linear probability model (LPM)?
Yes, but…
• Non-normality of the errors Ui
• Heteroscedastic variances of the errors
• Non-fulfillment of 0 ≤ E(Yi|Xi) ≤ 1
• Questionable value of R² as a measure
of goodness of fit
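The bound problem listed above can be sketched with a small example: fitting a straight line to a 0/1 dependent variable by ordinary least squares can yield fitted "probabilities" below 0 or above 1. The data and the helper name `ols_line` are hypothetical.

```python
def ols_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical data: x is a regressor, y is a 0/1 outcome (e.g. adopter = 1).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]

a, b = ols_line(x, y)
fitted = [a + b * xi for xi in x]

# LPM fitted values are read as probabilities, yet some fall outside [0, 1].
out_of_range = [round(f, 3) for f in fitted if f < 0 or f > 1]
print("fitted values outside [0, 1]:", out_of_range)
```

With these numbers the smallest and largest x values produce fitted values outside the unit interval, which is one reason for moving to logit or probit.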
8. What is logistic regression?
➢ Used to analyze relationships between a dichotomous
dependent variable and metric or dichotomous
independent variables
➢ Combines the independent variables to estimate the
probability that a particular event will occur or not
➢ Logistic regression (LR) is a nonlinear regression model that forces the output
(predicted probabilities) to lie between 0 and 1
It is called a qualitative response/discrete choice
model in the terminology of economics
9. Assumptions:
• No need for normality, linearity, or homogeneity of variance for the
independent variables
• No need for a linear relationship between the dependent and
independent variables
• No need for homoscedasticity
• The error terms need to be independent
• It requires quite large sample sizes
• Absence of perfect multicollinearity
11. Features of the logit model:
• As P goes from 0 to 1, the logit L goes from -∞ to +∞.
That is, although probabilities lie between 0 and 1, logits are
not so bounded.
• L is linear in X, but the probabilities themselves are not,
in contrast with the LPM, where probabilities
increase linearly with X.
• If L, the logit, is positive, it means that when the value of
the regressor(s) increases, the odds that the regressand
equals 1 increase, and vice versa.
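The first two features can be checked numerically. The sketch below assumes the usual definition L = ln(P / (1 - P)); the function names are illustrative only.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(l):
    """Probability corresponding to a logit value l."""
    return 1.0 / (1.0 + math.exp(-l))


# Logits are unbounded: they grow in magnitude as P approaches 0 or 1.
for p in (0.001, 0.5, 0.999):
    print(f"P = {p:<5} -> L = {logit(p):+.3f}")

# Equal steps in L give unequal steps in P (P is not linear in L).
step_near_middle = inverse_logit(1.0) - inverse_logit(0.0)
step_in_tail = inverse_logit(4.0) - inverse_logit(3.0)
print(f"change in P for L 0 -> 1: {step_near_middle:.4f}")
print(f"change in P for L 3 -> 4: {step_in_tail:.4f}")
```

Because L is linear in X, a one-unit change in X always shifts L by the same amount, while the resulting change in P depends on where on the curve the observation sits.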
12. Level of measurement:
• Logistic regression analysis requires that the dependent
variable be dichotomous.
• Logistic regression analysis requires that the
independent variables be metric or dichotomous.
• If an independent variable is nominal level and not
dichotomous, the logistic regression procedure in SPSS
has an option to dummy code the variable.
• If an independent variable is ordinal, we will attach the
usual caution.
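Dummy coding a nominal variable can be sketched as follows. This is a generic illustration in plain Python, not the SPSS procedure itself; the function name, the `soil_type` variable and its categories are hypothetical.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator column per category except the reference.

    A nominal variable with k categories becomes k - 1 dummy variables;
    the reference category is represented by all dummies being 0.
    """
    levels = sorted(set(values) - {reference})
    return {
        level: [1 if value == level else 0 for value in values]
        for level in levels
    }


# Hypothetical nominal variable with three categories.
soil_type = ["clay", "loam", "sand", "loam"]
dummies = dummy_code(soil_type, reference="clay")
for name, column in dummies.items():
    print(name, column)
```

The choice of reference category does not change the fit of the model, only the interpretation of each dummy's coefficient, which is then read relative to that category.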
13. Variables in logistic regression:
• In a typical logistic regression analysis there will always be
one dependent (dichotomous) variable, and
• Usually a set of independent variables that may be either
dichotomous or quantitative, or some combination.
14. Sample size: Logit model and Probit model
The minimum number of cases per independent
variable is 10
(Hosmer and Lemeshow, Applied Logistic Regression)
For example:
If we are using 8 independent variables, then the
minimum sample size should be 8 × 10 = 80
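The rule of thumb above is simple arithmetic; a small sketch (the function name is hypothetical):

```python
def minimum_sample_size(n_independent_variables, cases_per_variable=10):
    """Rule-of-thumb minimum sample size: cases per variable times variables."""
    return n_independent_variables * cases_per_variable


required = minimum_sample_size(8)
print(required)
```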
15. Logistic regression equation
ln(Pi / (1 - Pi)) = a + b1X1 + b2X2 + … + bnXn
Where,
Pi = probability of the event happening
e.g.: adoption of technology
(1 - Pi) = probability of the event not happening
e.g.: non-adoption of technology
X1, X2, …, Xn = independent variables
b1, b2, …, bn = regression coefficients
a = constant (intercept)
16. Example:
Dependent variable: Adoption / Non-adoption

Independent variable   Description                             Hypothesized relation
Age                    Chronological age of farmers (years)    -
Education              No. of years of formal schooling        +
Land holding           Farm size measured in acres             +
Access to training     Yes = 1 / No = 0                        +
Distance to market     In kilometers                           -
Access to credit       Yes = 1 / No = 0                        +
Extension services     Yes = 1 / No = 0                        +
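To see how the logistic regression equation turns such variables into a probability, here is a sketch that plugs one farmer's values into ln(P / (1 - P)) = a + b1X1 + … + bnXn and solves for P. All coefficient values are hypothetical: they are chosen only to carry the hypothesized signs and are not estimates from any study.

```python
import math

# Hypothetical intercept and coefficients (signs follow the hypothesized
# relations in the example; magnitudes are made up for illustration).
INTERCEPT = -1.0
COEFFICIENTS = {
    "age": -0.02,
    "education": 0.15,
    "land_holding": 0.10,
    "access_to_training": 0.80,
    "distance_to_market": -0.05,
    "access_to_credit": 0.60,
    "extension_services": 0.70,
}


def adoption_probability(farmer):
    """P = 1 / (1 + exp(-L)), where L is the logit from the equation."""
    logit = INTERCEPT + sum(
        COEFFICIENTS[name] * farmer[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical farmer.
farmer = {
    "age": 45,
    "education": 8,
    "land_holding": 3.0,
    "access_to_training": 1,
    "distance_to_market": 12,
    "access_to_credit": 0,
    "extension_services": 1,
}
p = adoption_probability(farmer)
print(f"predicted probability of adoption: {p:.3f}")
```

With a positive coefficient, raising that variable raises the predicted probability; with a negative one, it lowers it.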
18. Case 1:
A Logit Analysis of Bt Cotton Adoption and Assessment
of Farmers' Training Need
Padaria et al., 2009
19. Contd…
Padaria et al., 2009
Columns of the logistic regression output:
• B: the regression coefficient
• S.E.: the standard errors associated with the coefficients
• Wald and Sig.: the Wald chi-square value and two-tailed p-value used in testing the null
hypothesis that the coefficient (parameter) is 0. They indicate whether or not an
independent variable is significant in the model.
• df: the degrees of freedom for the Wald chi-square test
• Exp(B): the exponentiation of the B coefficient, which is an odds ratio. This value is
given by default because odds ratios can be easier to interpret than the coefficient
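The quantities described above are linked by simple formulas. The sketch below assumes the common single-coefficient form of the Wald statistic, (B / S.E.)², compared against a chi-square distribution with 1 degree of freedom. The values of B and S.E. are hypothetical and not taken from Padaria et al.

```python
import math


def wald_summary(b, se):
    """Return (Wald chi-square, two-tailed p-value, Exp(B)) for one coefficient."""
    wald = (b / se) ** 2
    # For 1 degree of freedom, P(chi-square > w) = P(|Z| > sqrt(w))
    # for a standard normal Z, which equals erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    odds_ratio = math.exp(b)
    return wald, p_value, odds_ratio


# Hypothetical coefficient and standard error.
wald, p_value, odds_ratio = wald_summary(b=0.8, se=0.4)
print(f"Wald = {wald:.2f}, p = {p_value:.4f}, Exp(B) = {odds_ratio:.3f}")
```

An Exp(B) above 1 means the odds of the event rise as the variable increases; below 1 means they fall.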
20. Advantages of the logit model:
➢ Transformation of a dichotomous
dependent variable into a continuous variable
➢ Results are easily interpretable
➢ Simple method to analyse
➢ It gives parameter estimates that are asymptotically
consistent, efficient and normal, so that the analogue
of the regression t-test can be applied.
21. Limitations:
• As in the case of the linear probability model, the disturbance term in
the logit model is heteroscedastic, and therefore we should go
for weighted least squares.
• As in many other regressions, there may be a problem of
multicollinearity if the explanatory variables are related
among themselves.
22. Applications of the logit model:
1. It can be used to identify the factors that affect the adoption of a
particular technology, say, use of new crop varieties, fertilizers,
pesticides, etc. on the farm.
2. The model is used extensively to analyze growth phenomena such as
population, GNP, money supply, etc.
3. In the field of marketing it can be used to study brand preferences and
brand loyalty.
4. Gender studies can use logit analysis to find out the factors which
affect the decision-making status of men and women in the family.
23. Probit regression model:
• The probit model is a type of regression where the dependent
variable can only take two values, for example adoption or
non-adoption, married or not married.
• The purpose of the model is to estimate the probability
that the event occurs.
• The estimating model that emerges from the normal cumulative
distribution function (CDF) is popularly known as the probit
model.
• Sometimes it is also called the normit model.
24. Probit: Level of measurement requirements
• Dependent variable: dichotomous/categorical
E.g.: adoption and non-adoption,
participation and non-participation
• Independent variables: metric or dichotomous
E.g.: age (ratio-level data),
gender, male/female (dichotomous)
25. Case 2: Factors Affecting Adoption of Improved Rice Varieties
among Rural Farm Households in Central Nepal
Ghimire (2015) (Published in: Rice Science)
27. Difference between Logit and Probit models:

Logit                                   Probit
Slightly flatter tails                  The conditional probability Pi approaches
                                        0 or 1 at a faster rate
Based on the standard logistic          Based on the standard normal
distribution                            distribution
Variance = π² / 3                       Variance = 1
Simple mathematics                      Sophisticated mathematics

Both give similar results; the choice of method depends on the researcher's preference,
but logit regression is mostly preferred.
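The tail behaviour and the two variances can be checked with the standard library. This sketch compares the standard logistic CDF with the standard normal CDF; the rescaled comparison (dividing the logistic by its standard deviation π/√3) is an added assumption, included so both curves have unit variance.

```python
import math


def logistic_cdf(z):
    """CDF of the standard logistic distribution (variance pi^2 / 3)."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_cdf(z):
    """CDF of the standard normal distribution (variance 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


logistic_sd = math.pi / math.sqrt(3.0)

for z in (-3.0, -2.0, -1.0):
    raw = logistic_cdf(z)
    # Logistic rescaled to unit variance for a like-for-like comparison.
    rescaled = logistic_cdf(z * logistic_sd)
    print(f"z = {z:+.0f}: logistic = {raw:.5f}, "
          f"unit-variance logistic = {rescaled:.5f}, normal = {normal_cdf(z):.5f}")
```

Far out in the tail the normal CDF is closer to 0 than the logistic one, even after rescaling, which is what "approaches 0 or 1 at a faster rate" refers to.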
28. Significance of the Wald test
• To test the statistical significance of the unique
contribution of each coefficient in the
model
• This test is similar to the t-test in
multiple regression
29. Ordinal logit & probit models
• Both are used when the outcome has more than 2 categories
and the categories are ordinal in nature
• The dependent variable:
• E.g. 1: Likert-type scale: strongly agree, somewhat agree, strongly
disagree
• E.g. 2: less than high school (0), high school (1), college (2),
postgraduate (3)
• The independent variables remain the same as in the logit and probit
models
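As a rough sketch of how an ordinal logit model assigns probabilities to ordered categories, the code below uses one common parameterization, P(Y ≤ j) = Λ(cj - Xb), where Λ is the logistic CDF and the cj are increasing cut points. The cut points and the value of Xb are hypothetical.

```python
import math


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(xb, cutpoints):
    """Category probabilities for len(cutpoints) + 1 ordered categories.

    cutpoints must be in increasing order; xb is the linear predictor.
    """
    # Cumulative probabilities P(Y <= j); the last category closes at 1.
    cumulative = [logistic_cdf(c - xb) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


# Hypothetical: four education levels (0 to 3) need three cut points.
probs = ordered_logit_probs(xb=0.5, cutpoints=[-1.0, 0.0, 1.5])
for level, prob in enumerate(probs):
    print(f"P(Y = {level}) = {prob:.3f}")
```

An ordinal probit version would replace the logistic CDF with the standard normal CDF and keep the same structure.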
30. Multinomial logit and multinomial probit
• Used when the dependent variable is not ordinal in nature and
has more than 2 categories.
• E.g. 1: adoption of different adaptation strategies;
choice of transportation to work
• E.g. 2: occupation classification: unskilled, semiskilled, highly
skilled
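A minimal sketch of how a multinomial logit model turns linear predictors into category probabilities: one category serves as the base (its linear predictor is fixed at 0) and each probability is the exponentiated predictor divided by the sum over all categories. The numbers are hypothetical.

```python
import math


def multinomial_logit_probs(linear_predictors):
    """Map {category: linear predictor} to {category: probability}.

    The base category should be included with a linear predictor of 0.0.
    """
    # Subtracting the maximum keeps exp() numerically stable and does not
    # change the resulting probabilities.
    largest = max(linear_predictors.values())
    exps = {k: math.exp(v - largest) for k, v in linear_predictors.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


# Hypothetical linear predictors for one person; "unskilled" is the base.
predictors = {"unskilled": 0.0, "semiskilled": 0.4, "highly skilled": -0.3}
probs = multinomial_logit_probs(predictors)
for category, prob in probs.items():
    print(f"P({category}) = {prob:.3f}")
```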
31. Multinomial logit model
Dependent variable: compost, conservation tillage, both
We have three categories, i.e. > 2 categories
Kassie et al.
32. References:
• Meyers, L. S., Gamst, G., & Guarino, A. J. (2006).
Applied Multivariate Research: Design and
Interpretation.
• Padaria et al. (2009). A Logit Analysis of Bt Cotton
Adoption and Assessment of Farmers' Training Need.
Indian Res. J. Ext. Edu. 9(2).
• Gujarati, D. N., et al. (2012). Basic Econometrics. McGraw
Hill Education, India.