Categorical dependent variables: Application using WVS data from selected Arab countries
1. Categorical dependent variables:
Application using WVS data from selected Arab countries
Irina Vartanova
Institute for Futures Studies, Stockholm
ERF Workshop – May 11, 2015
1 / 25
2. Example: Income and Happiness
• Positive but diminishing association between income and
happiness (see Clark et al., 2007, for a review).
• The association can be partially explained by reverse
causation and by unobserved individual characteristics, such
as personality traits.
• Relative income is more important than actual income; moreover,
comparisons of relative position are also made across nations.
3. Variables
Taking all things together, would you say you are
• Very happy
• Rather happy
• Not very happy
• Not at all happy
Coding: "Very happy" and "Rather happy" = 1; "Not very happy" and "Not at all happy" = 0.
On this card is an income scale on which 1 indicates the lowest
income group and 10 the highest income group in your country.
We would like to know in what group your household is.
1 - Lowest group 10 - Highest group
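A minimal sketch of how the two survey items could be prepared for a binary model. It assumes that the two upper happiness categories are coded 1 and the two lower ones 0, and that any answer outside the printed categories counts as missing. The names HAPPY_CODES, recode_happiness and clean_income are hypothetical.

```python
# Assumed binary coding of the four happiness answers (1 = happy, 0 = not happy).
HAPPY_CODES = {
    "Very happy": 1,
    "Rather happy": 1,
    "Not very happy": 0,
    "Not at all happy": 0,
}


def recode_happiness(answer):
    """Return 1 or 0 for a listed answer, None (missing) for anything else."""
    return HAPPY_CODES.get(answer)


def clean_income(step):
    """Keep an income step on the 1-10 scale, return None (missing) otherwise."""
    if isinstance(step, int) and 1 <= step <= 10:
        return step
    return None


print(recode_happiness("Rather happy"), clean_income(10), clean_income(11))
```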
4. Data
• WVS, 6th wave
• 12 MENA countries: Algeria, Egypt, Iraq, Jordan, Kuwait,
Lebanon, Libya, Morocco, Palestine, Qatar, Tunisia, Yemen
• Bahrain excluded
• Pooled sample: 9928 after listwise deletion of missing cases
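Listwise deletion keeps only respondents with a valid answer on every variable in the model. A small sketch with made-up records follows. The variable names, the data and the helper listwise_delete are hypothetical, and missing answers are assumed to be stored as None.

```python
def listwise_delete(rows, variables):
    """Keep only rows that have a non-missing (not None) value on every variable."""
    return [row for row in rows if all(row.get(v) is not None for v in variables)]


# Made-up respondents, for illustration only.
respondents = [
    {"happy": 1, "s_income": 5, "female": 1},
    {"happy": None, "s_income": 7, "female": 0},  # missing outcome
    {"happy": 0, "s_income": None, "female": 1},  # missing predictor
    {"happy": 1, "s_income": 3, "female": 0},
]

complete = listwise_delete(respondents, ["happy", "s_income", "female"])
print(len(complete))  # 2
```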
10. (Relatively) full model of happiness
Based on an extensive review of the factors of subjective
well-being (Dolan et al., 2008), we control for:
• Gender - women tend to report higher happiness.
• Age squared - younger and older generations are happier.
• Marital status - being married is associated with the highest
happiness and being divorced with the lowest.
• Having children - the effect is mixed: positive for life
satisfaction, but not for happiness. Additional children can
have negative consequences; the effect is also culturally dependent.
• Health.
• Education has a positive effect, especially in low-income
countries.
• Unemployment is detrimental to happiness, especially among
men.
• Religiosity has a positive effect on happiness.
• General trust is positively associated with happiness.
11. (Relatively) full model of happiness - 2
Dependent variable: Happiness
Variable                  Coefficient   (SE)
s.income                    0.198***    (0.013)
female                      0.307***    (0.054)
poly(age, 2)1               5.221       (3.915)
poly(age, 2)2              21.006***    (3.252)
education Middle           −0.057       (0.060)
education High             −0.055       (0.081)
marital.st Divorced        −0.661***    (0.148)
marital.st Widowed         −0.396***    (0.125)
marital.st Single          −0.366***    (0.109)
children 1 child            0.278**     (0.125)
children 2 or more          0.113       (0.100)
(to be continued)
12. (Relatively) full model of happiness - 3
Variable                              Coefficient   (SE)
s.health Good                          −0.649***    (0.069)
s.health Fair                          −1.765***    (0.076)
s.health Poor                          −2.767***    (0.112)
imp.religion Rather important          −0.474***    (0.089)
imp.religion Not very important        −0.895***    (0.166)
imp.religion Not at all important      −1.164***    (0.214)
general.trust                           0.325***    (0.065)
unemployed                             −0.494***    (0.103)
Female:unemployed                       0.158       (0.175)
Constant                                1.712***    (0.155)
N                                      13750
Log Likelihood                         −5302.329
14. Multiplicity Correction
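• Since we test 66 hypotheses simultaneously, around 3 of them
(66 × 0.05 ≈ 3.3) could be significant by chance.
• There are several ways to correct for multiple testing. Here
I use the Holm correction, a step-down procedure that compares
the smallest p-value with α/n, the next smallest with
α/(n − 1), and so on, which keeps the error rate for the entire
set of n tests at α.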
[Figure: matrix of pairwise differences between the country coefficients. Rows: Palestine, Iraq, Jordan, Kuwait, Lebanon, Libya, Morocco, Qatar, Tunisia, Egypt, Yemen. Columns: Egypt, Tunisia, Qatar, Morocco, Libya, Lebanon, Kuwait, Jordan, Iraq, Palestine, Algeria. Each cell gives the difference b_row − b_col (bold) and its standard error SE(b_row − b_col) (italic); shading marks each difference as significantly < 0, not significant, or significantly > 0.]
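A minimal sketch of the Holm step-down procedure, assuming a plain list of p-values and α = 0.05. The function name holm_reject and the example p-values are hypothetical; they are not the p-values behind the country comparisons.

```python
import math


def holm_reject(p_values, alpha=0.05):
    """Holm step-down: return a list of booleans, True where the null is rejected."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * n
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (n - k + 1).
        if p_values[idx] <= alpha / (n - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained as well
    return reject


# 12 countries give 12 * 11 / 2 = 66 pairwise comparisons.
n_tests = math.comb(12, 2)
print(n_tests, round(n_tests * 0.05, 1))  # 66 3.3

# Made-up p-values, for illustration only.
print(holm_reject([0.001, 0.04, 0.03, 0.0004]))  # [True, False, False, True]
```

With four tests the thresholds are 0.0125, 0.0167, 0.025 and 0.05; the third smallest p-value (0.03) exceeds 0.025, so the procedure stops there.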
15. Interpretation: Odds Ratios
• Odds ratios describe the factor by which the odds change as one
variable changes, holding the others constant. They are easily
calculated:
$e^{b_{s.income}} = e^{0.198} = 1.22$
• Interpretation: compare people who live in the same MENA
country and have the same gender, age, education and so on,
but who score 5 versus 6 on subjective income. For every 100
happy people relative to a fixed number of unhappy people in
the first group, we would expect 122 happy people relative to
the same number of unhappy people in the second group: the
odds of being happy are 22% higher.
• Odds ratios remain only relative comparisons due to
unobserved heterogeneity (Mood, 2010).
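A short sketch of the calculation, using the s.income coefficient from the happiness model. The baseline probabilities in the loop are hypothetical; they only illustrate that the same odds ratio implies different changes in probability depending on the starting point.

```python
import math

B_S_INCOME = 0.198  # s.income coefficient from the happiness model


def odds_to_prob(odds):
    """Convert odds p / (1 - p) back to the probability p."""
    return odds / (1.0 + odds)


odds_ratio = math.exp(B_S_INCOME)
print(round(odds_ratio, 2))  # 1.22

# Hypothetical probabilities of being happy at income score 5.
for p5 in (0.50, 0.80, 0.95):
    odds5 = p5 / (1.0 - p5)
    p6 = odds_to_prob(odds5 * odds_ratio)  # one step up multiplies the odds
    print(p5, round(p6, 3), round(p6 / p5, 3))
```

The ratio of probabilities in the last column is always smaller than 1.22, which is why the odds ratio should not be read as "22% more happy people".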
16. Alternatives: Types of Marginal Effects
• Average Marginal Effects - take the average of the marginal
effect across all cases used to estimate the model.
• Marginal Effects at the Mean - take the marginal effect of
each variable, holding all other variables constant at their
mean values.
• Marginal Effects at Representative Values - take the marginal
effect of each variable, holding all others constant at
substantively or theoretically interesting values.
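The three types can be compared in a small sketch. It assumes a deliberately simplified logit model with an intercept and subjective income only, where the marginal effect of a continuous variable is β·p·(1 − p). The intercept, the income values and the representative value 2 are hypothetical; only the coefficient 0.198 comes from the happiness model.

```python
import math


def inv_logit(eta):
    """Logistic function: linear predictor -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def marginal_effect(beta, eta):
    """Derivative of the predicted probability with respect to x at linear predictor eta."""
    p = inv_logit(eta)
    return beta * p * (1.0 - p)


INTERCEPT = 0.5                       # hypothetical
B_INCOME = 0.198                      # s.income coefficient from the happiness model
incomes = [1, 3, 4, 5, 5, 6, 7, 10]   # hypothetical sample of income scores

# Average Marginal Effect: average the effect over all observed cases.
ame = sum(marginal_effect(B_INCOME, INTERCEPT + B_INCOME * x) for x in incomes) / len(incomes)

# Marginal Effect at the Mean: evaluate once at the mean income.
mean_income = sum(incomes) / len(incomes)
mem = marginal_effect(B_INCOME, INTERCEPT + B_INCOME * mean_income)

# Marginal Effect at a Representative value: evaluate at a chosen income score.
mer = marginal_effect(B_INCOME, INTERCEPT + B_INCOME * 2)

print(round(ame, 4), round(mem, 4), round(mer, 4))
```

In a model with more covariates the same logic applies, with the linear predictor built from all variables.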
18. Model Fit, Evaluation and Comparison
• Specification tests: Wald test, Likelihood ratio test
• Pseudo-R2
• Information criteria: AIC, BIC
19. Testing: Wald and Likelihood-Ratio Test
• A Wald test is based on the assumption that $B \sim N(\beta, V(B))$
and tests whether $\beta = 0$.
• A likelihood-ratio test compares two models, a full one ($M_F$)
with coefficients $B_F$ and a nested model ($M_R$) which places $q$
linear restrictions on the coefficients in $B_F$:
$LLR = -2\,(LL(M_R) - LL(M_F)) \sim \chi^2_q$
20. Hypothesis Test for Subjective Income
Wald test
     Res.Df   Df   χ2       Pr(> χ2)
1    13718
2    13718    1    229.48   < 2.2e-16 ***
LR test
     Df   LogLik     Df   χ2       Pr(> χ2)
1    32   -5302.3
2    31   -5420.2    -1   235.69   < 2.2e-16 ***
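Both statistics can be roughly reproduced by hand from the rounded numbers on the slides (coefficient 0.198, standard error 0.013, log-likelihoods −5302.3 and −5420.2). Because the inputs are rounded, the results differ slightly from the reported 229.48 and 235.69. The function names are hypothetical, and the p-value helper covers only the 1-degree-of-freedom case.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def wald_chi2(b, se):
    """Wald statistic for a single coefficient: (b / se)^2."""
    return (b / se) ** 2


def lr_chi2(ll_restricted, ll_full):
    """Likelihood-ratio statistic: -2 * (LL(M_R) - LL(M_F))."""
    return -2.0 * (ll_restricted - ll_full)


wald = wald_chi2(0.198, 0.013)
lr = lr_chi2(-5420.2, -5302.3)

print(round(wald, 1), chi2_sf_1df(wald))
print(round(lr, 1), chi2_sf_1df(lr))
```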
21. Pseudo-R2
• Pseudo-R2 measures rely on analogies to the R2 of the linear
model, but none of them can be interpreted as the proportion of
variation in the dependent variable explained by the
independent variables.
• There are many different types, and they produce different results.
22. Several Pseudo-R2 for the Happiness Model
OLS R2: $R^2 = 1 - \frac{SS_{residual}}{SS_{total}}$
Efron's R2: $1 - \frac{\sum_{i=1}^{N} (y_i - \pi_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} = .308$
McFadden's R2: $1 - \frac{\log L(M_{Full})}{\log L(M_{Null})} = .291$
Cox & Snell's R2: $1 - \left( \frac{L(M_{Null})}{L(M_{Full})} \right)^{2/N} = .273$
Count R2: $\frac{\text{Correct}}{\text{Count}} = .824$
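A sketch of how these four measures could be computed from observed 0/1 outcomes and predicted probabilities. The toy data and the function names are hypothetical. The null model is assumed to be intercept-only, so it predicts the sample mean for everyone, and Cox & Snell's measure is written on the likelihood ratio as 1 − exp(2(LL_null − LL_full)/N). The count R2 assumes a classification cut-off of 0.5.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y given predicted probabilities p."""
    return sum(math.log(pi if yi == 1 else 1.0 - pi) for yi, pi in zip(y, p))


def pseudo_r2(y, p):
    """Return Efron, McFadden, Cox & Snell and count pseudo-R2 values."""
    n = len(y)
    y_bar = sum(y) / n
    ll_full = log_likelihood(y, p)
    ll_null = log_likelihood(y, [y_bar] * n)  # intercept-only model predicts the mean
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    correct = sum(1 for yi, pi in zip(y, p) if (pi >= 0.5) == (yi == 1))
    return {
        "efron": 1.0 - ss_res / ss_tot,
        "mcfadden": 1.0 - ll_full / ll_null,
        "cox_snell": 1.0 - math.exp(2.0 * (ll_null - ll_full) / n),
        "count": correct / n,
    }


# Toy outcomes and predicted probabilities, for illustration only.
y = [1, 1, 1, 0, 1, 0]
p = [0.9, 0.8, 0.7, 0.4, 0.6, 0.2]
r2 = pseudo_r2(y, p)
print({name: round(value, 3) for name, value in r2.items()})
```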
23. AIC and BIC
Akaike’s Information Criterion
$AIC = -2\log L(\Theta \mid \text{data}) + 2K$
Bayesian Information Criterion
$BIC = -2\log(L) + K\log(n)$
where K is the number of estimated parameters and n is the number of observations.
• Whether to use AIC or BIC depends on how much one wants
to penalize additional model parameters.
24. AIC and BIC for the Happiness Model
                  Base model    Without Gender*Unemployment
Log Likelihood    −5302.329     −5302.867
AIC               10668.66      10667.73
BIC               10909.58      10901.13
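The reported values can be checked directly from the two formulas. This sketch assumes K = 32 parameters for the base model and K = 31 without the interaction (the Df column of the LR test) and n = 13750 (the N of the happiness model). The function names are hypothetical.

```python
import math


def aic(log_lik, k):
    """Akaike's Information Criterion: -2 log L + 2K."""
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik, k, n):
    """Bayesian Information Criterion: -2 log L + K log(n)."""
    return -2.0 * log_lik + k * math.log(n)


N_OBS = 13750  # N reported for the happiness model

base = (aic(-5302.329, 32), bic(-5302.329, 32, N_OBS))
reduced = (aic(-5302.867, 31), bic(-5302.867, 31, N_OBS))

print(round(base[0], 2), round(base[1], 2))        # 10668.66 10909.58
print(round(reduced[0], 2), round(reduced[1], 2))  # 10667.73 10901.13
```

Both criteria are lower for the model without the interaction, and the gap is larger for BIC because log(13750) ≈ 9.53 penalizes the extra parameter more heavily than the factor 2 in AIC.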