QUALITY OF LIFE IN G20 COUNTRIES

INTRODUCTION
The aim of this research is to investigate how the quality of life in G20 countries is related to a set of quality-of-life indicators. By quality of life we mean the general well-being of individuals and societies. The term is used in a wide range of contexts, including international development, healthcare, and politics. Standard indicators of the quality of life include not only wealth and employment, but also the built environment, physical and mental health, education, recreation and leisure time, and social belonging. From this variety of indicators we have chosen eight.
Life expectancy is a key indicator of the general health of the population. Improvements in overall life expectancy reflect improvements in social and economic conditions, lifestyle, access to health services and medical advances. This indicator uses estimated life expectancy at birth.
CO2 emissions and terrestrial protected areas are indicators of how the natural environment supports its people, economy and culture. As the population grows and economic activity increases, more demands are placed on the natural environment. Environmental issues in turn affect the economy and public health. For this reason another indicator we have chosen is health expenditure per capita, which is closely related to the previous indicators.
Urban population is included because population growth and change in cities affect the relationships people have with others and their sense of belonging to an area. The concept of community is fundamental to people’s overall quality of life and sense of belonging. We have also chosen subsidies and other transfers as an indicator of quality of life because they are an instrument through which governments redistribute wealth among the people of a country.
Public expenditure on education provides an insight into the knowledge and skills of residents and how they can apply these to improve their quality of life. Educational achievement is essential for effective participation in society. The last indicator is unemployment: a reduction in unemployment helps stimulate further opportunities for economic growth and development within a community or nation.
The countries considered (G20 countries, which are among the richest in the world) are: Canada, France, Germany, Japan, Italy, Russian Federation, United States, United Kingdom, Brazil, China, India, South Africa, Australia, Saudi Arabia, South Korea, Indonesia, Mexico, Turkey, Spain, Netherlands.
The data source is the World Bank's World Development Indicators (WDI). The data refer to the year 2008.
The software used in this project is:
• Gretl (regression)
• R-Project (factor and cluster analysis)
• Microsoft Excel (data matrix elaboration, before and after using R)
We have numbered the variables from X1 to X8:
• X1 = CO2 emissions (kg per 2000 US$ of GDP)
• X2 = Urban population
• X3 = Health expenditure per capita (current US$)
• X4 = Life expectancy at birth, total (years)
• X5 = Unemployment, total (% of total labor force)
• X6 = Public spending on education, total (% of GDP)
• X7 = Subsidies and other transfers (% of expense)
• X8 = Terrestrial protected areas (% of total land area)
Correlation matrix

        X1        X2        X3        X4        X5        X6        X7        X8
X1    1.0000    0.4108   -0.6168   -0.7387    0.2370   -0.4123   -0.0290   -0.2151
X2              1.0000   -0.2571   -0.2300   -0.2166   -0.5982   -0.1159   -0.0277
X3                        1.0000    0.6361   -0.2003    0.4932    0.3154    0.1806
X4                                  1.0000   -0.6507    0.2132    0.2230    0.2105
X5                                            1.0000    0.0424   -0.0984   -0.1525
X6                                                      1.0000    0.0872    0.2719
X7                                                                1.0000    0.1855
X8                                                                          1.0000
The values above, computed with R, are Pearson correlation coefficients. The correlations are not very high overall, but enough of them are non-negligible to justify a factor analysis. We consider a correlation strong if |corr| > 0.7 and moderate if 0.3 < |corr| < 0.7. On this basis there is a strong correlation between X4 and X1, and there are moderate correlations between X1 and each of X6, X3 and X2, between X2 and X6, between X3 and each of X6, X4 and (marginally) X7, and finally between X4 and X5.
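The classification rule stated above can be sketched in a few lines of Python. This is only an illustration, not the R code used for the analysis: the function name and the "weak" label for values below 0.3 are assumptions, and the rule is applied to the absolute value of the coefficient so that negative correlations are covered as well.

```python
# Upper triangle of the Pearson correlation matrix reported in the text.
CORRELATIONS = {
    ("X1", "X2"): 0.4108, ("X1", "X3"): -0.6168, ("X1", "X4"): -0.7387,
    ("X1", "X5"): 0.2370, ("X1", "X6"): -0.4123, ("X1", "X7"): -0.0290,
    ("X1", "X8"): -0.2151,
    ("X2", "X3"): -0.2571, ("X2", "X4"): -0.2300, ("X2", "X5"): -0.2166,
    ("X2", "X6"): -0.5982, ("X2", "X7"): -0.1159, ("X2", "X8"): -0.0277,
    ("X3", "X4"): 0.6361, ("X3", "X5"): -0.2003, ("X3", "X6"): 0.4932,
    ("X3", "X7"): 0.3154, ("X3", "X8"): 0.1806,
    ("X4", "X5"): -0.6507, ("X4", "X6"): 0.2132, ("X4", "X7"): 0.2230,
    ("X4", "X8"): 0.2105,
    ("X5", "X6"): 0.0424, ("X5", "X7"): -0.0984, ("X5", "X8"): -0.1525,
    ("X6", "X7"): 0.0872, ("X6", "X8"): 0.2719,
    ("X7", "X8"): 0.1855,
}


def correlation_strength(r: float) -> str:
    """Classify a correlation using the thresholds 0.3 and 0.7 on |r|.

    The "weak" label is an assumption; the text only names the
    "strong" and "moderate" bands.
    """
    size = abs(r)
    if size > 0.7:
        return "strong"
    if size > 0.3:
        return "moderate"
    return "weak"


strong = [pair for pair, r in CORRELATIONS.items()
          if correlation_strength(r) == "strong"]
moderate = [pair for pair, r in CORRELATIONS.items()
            if correlation_strength(r) == "moderate"]

print("strong:", strong)
print("moderate:", moderate)
```

Working on the absolute value matters here because the strongest relationship in the matrix (X1 with X4) is negative.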
REGRESSION MODEL

Model 1: OLS, number of observations 1-20
Dependent variable: Life expectancy at birth

                                           Coefficient     Std. Error     t-ratio    p-value
Constant                                    88.4781         8.19707        10.7939   <0.00001  ***
CO2 emissions (kg per 2000 US$ of GDP)      -3.18062        1.18728        -2.6789    0.02008  **
Urban population                            -1.19832e-08    8.08775e-09    -1.4817    0.16421
Health expenditure per capita                0.00106495     0.000551237     1.9319    0.07732  *
Unemployment, total                         -0.903724       0.206679       -4.3726    0.00091  ***
Public spending on education                -1.75829        1.13982        -1.5426    0.14888
Subsidies and other transfers                0.0396108      0.0953704       0.4153    0.68523
Terrestrial protected areas                  0.026664       0.0893965       0.2983    0.77060

R-squared             0.865092
Adjusted R-squared    0.786395
P-value(F)            0.000221
We ran the regression in Gretl using the ordinary least squares (OLS) method. The R-squared of 0.865 indicates that the model as a whole fits the data well. The P-value(F) is also very low, which means that the model as a whole is significant at any conventional level of α. The dependent variable is “life expectancy at birth” and the others are the independent variables.
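As a cross-check of the fit statistics, the adjusted R-squared can be recomputed from the R-squared, the number of observations and the number of regressors. The sketch below uses the standard adjustment formula; n = 20 and k = 7 are read off the regression output, and the function name is only illustrative.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - k - 1).

    n is the number of observations, k the number of regressors
    (excluding the constant).
    """
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Values taken from the regression output: 20 countries, 7 regressors.
adj = adjusted_r_squared(0.865092, n=20, k=7)
print(f"adjusted R-squared = {adj:.4f}")
```

The result agrees with the reported value of 0.786395 up to rounding. The gap between the two statistics reflects the penalty for using seven regressors with only twenty observations.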
The independent variables with a significant p-value are CO2 emissions, health expenditure per capita and unemployment. For CO2 emissions the p-value (0.02008) is smaller than 0.05, so we reject the null hypothesis and conclude that this regressor has a significant impact on life expectancy at birth at the 5% level. For health expenditure per capita the p-value (0.07732) is smaller than 0.1, so we reject the null hypothesis and conclude that this regressor has a significant impact on life expectancy at birth at the 10% level. Finally, for total unemployment the p-value (0.00091) is smaller than 0.01, so we reject the null hypothesis and conclude that this regressor has a significant impact on life expectancy at birth at the 1% level.
We can therefore conclude that, holding the other regressors constant, an increase in CO2 emissions of 1 kg per 2000 US$ of GDP is associated with a reduction in life expectancy at birth of 3.18062 years. Likewise, an increase in health expenditure per capita of 1 current US$ is associated with an increase in life expectancy at birth of 0.00106495 years. Finally, an increase in total unemployment of 1 percentage point is associated with a reduction in life expectancy at birth of 0.903724 years.
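The reading of the regression table can be reproduced with a short Python sketch. It recomputes each t-ratio as coefficient divided by standard error, maps p-values to the star notation used in the table (* for 10%, ** for 5%, *** for 1%), and scales a coefficient to a larger change in the regressor. The function names and the 1000 US$ example are illustrative assumptions, not part of the original analysis.

```python
# (coefficient, standard error, p-value) copied from the regression table.
REGRESSORS = {
    "CO2 emissions": (-3.18062, 1.18728, 0.02008),
    "Health expenditure per capita": (0.00106495, 0.000551237, 0.07732),
    "Unemployment, total": (-0.903724, 0.206679, 0.00091),
}


def t_ratio(coefficient: float, std_error: float) -> float:
    """t-ratio for the null hypothesis that the coefficient is zero."""
    return coefficient / std_error


def stars(p_value: float) -> str:
    """Significance stars as used in the table: * 10%, ** 5%, *** 1%."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


def predicted_change(coefficient: float, delta: float) -> float:
    """Change in life expectancy (years) for a change `delta` in one
    regressor, holding the other regressors constant."""
    return coefficient * delta


for name, (coef, se, p) in REGRESSORS.items():
    print(f"{name}: t = {t_ratio(coef, se):.4f} {stars(p)}")

# Hypothetical example: 1000 US$ more health expenditure per capita.
extra_years = predicted_change(REGRESSORS["Health expenditure per capita"][0], 1000)
print(f"+1000 US$ per capita -> {extra_years:.2f} years")
```

Scaling the health expenditure coefficient this way makes its size easier to judge, since a change of a single dollar is too small to be meaningful.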
FACTOR ANALYSIS

To run the factor analysis we applied the “principal component method” using R. We obtained the following eigenvalues, proportions of total variance and cumulative proportions of total variance.

Component   Eigenvalue    Proportion of variance (total)   Cumulative proportion of variance (total)
1           3.13602447    0.3920031                        0.3920031
2           1.59218446    0.1990231                        0.5910261
3           1.06125308    0.1326566                        0.7236828
4           0.88797144    0.1109964                        0.8346792
5           0.55766918    0.06970865                       0.90438783
6           0.48900580    0.06112573                       0.96551355
7           0.19844296    0.02480537                       0.99031892
8           0.07744861    0.009681076                      1.000000000
To select how many factors to use we applied the Kaiser criterion: we kept the components with eigenvalues greater than 1 and dropped all components with eigenvalues under 1. An eigenvalue is approximately the equivalent number of original variables that the factor represents. Looking at the table we can see that with the 3 retained factors the model explains 72.37% of the total original variability.
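The selection step can be illustrated with a minimal Python sketch that applies the Kaiser criterion to the eigenvalues reported in the table and accumulates the share of variance they explain. The variable names are illustrative; the original computation was done in R.

```python
# Eigenvalues of the correlation matrix, as reported in the table.
EIGENVALUES = [
    3.13602447, 1.59218446, 1.06125308, 0.88797144,
    0.55766918, 0.48900580, 0.19844296, 0.07744861,
]

# With standardized variables the eigenvalues sum to the number of
# variables (8), so each proportion is eigenvalue / 8.
total_variance = sum(EIGENVALUES)
proportions = [ev / total_variance for ev in EIGENVALUES]

# Kaiser criterion: retain only the components with eigenvalue > 1.
retained = [ev for ev in EIGENVALUES if ev > 1.0]
explained_by_retained = sum(ev / total_variance for ev in retained)

print(f"total variance: {total_variance:.4f}")
print(f"components retained: {len(retained)}")
print(f"variance explained by retained components: {explained_by_retained:.2%}")
```

The fourth eigenvalue (0.888) falls just short of the threshold, so the criterion keeps three components.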
SCREE PLOT

The scree plot shows the same results from another point of view. The plot puts the components on the X axis and the corresponding eigenvalues on the Y axis.
The factor loading l_ij is the covariance between the j-th common factor and the i-th original variable. Because the chosen variables are standardized, it coincides with the correlation between the j-th common factor and the i-th original variable. In this case the minimum value is -1 (perfect negative correlation) and the maximum value is 1 (perfect positive correlation).
[Figure: scree plot of the principal components. X axis: components (Comp.1 onwards); Y axis: variances, from 0.0 to 3.0.]
VARIANCE EXPLAINED BY EACH FACTOR

FACTOR 1    FACTOR 2    FACTOR 3
30.11%      22.34%      8.9%
The portion of total variability explained by the first factor is 2.409/8 = 30.11% (SS loading divided by the total variance, which equals 8 because the eight variables are standardized). The portion of total variability explained by the second factor is 1.787/8 = 22.34%. The portion of total variability explained by the third factor is 0.712/8 = 8.9%. The total variance explained by the model is 61.35%, which indicates that the model is quite good.
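The arithmetic above can be checked with a small Python sketch. It divides each SS loading by the total variance of the eight standardized variables and sums the three shares. The names are illustrative only.

```python
# Total variance of eight standardized variables.
N_VARIABLES = 8

# Sums of squared loadings for the three retained factors (from the text).
SS_LOADINGS = [2.409, 1.787, 0.712]

proportions = [ss / N_VARIABLES for ss in SS_LOADINGS]
total_explained = sum(proportions)

for index, share in enumerate(proportions, start=1):
    print(f"Factor {index}: {share:.2%}")
print(f"Total: {total_explained:.2%}")
```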
FACTOR LOADING MATRIX

                                                 Factor 1    Factor 2    Factor 3
CO2 emissions (X1)                                -0.596      -0.349      -0.460
Health expenditure per capita (X2)                 0.532       0.430       0.334
Life expectancy at birth (X3)                      0.923                   0.376
Public spending on education (X4)                  0.246       0.955      -0.148
Subsidies and other transfers of expense (X5)      0.188                   0.122
Terrestrial protected areas (X6)                   0.237       0.216
Unemployment (X7)                                 -0.869       0.325       0.365
Urban population (X8)                             -0.106      -0.640      -0.274

SS loadings                                        2.409       1.787       0.712
Proportion Var                                     0.301       0.223       0.089
Cumulative Var                                     0.301       0.525       0.614

Blank entries are loadings too small to be printed by R. In the factor analysis tables the variables are listed in alphabetical order, so the labels X1 to X8 used here differ from those defined for the correlation matrix.
FINAL ESTIMATION OF THE COMMUNALITIES

                                                 Communality    Specific variance
CO2 emissions (X1)                                0.689          0.311
Health expenditure per capita (X2)                0.58           0.42
Life expectancy at birth (X3)                     0.995          0.005
Public spending on education (X4)                 0.995          0.005
Subsidies and other transfers of expense (X5)     0.054          0.946
Terrestrial protected areas (X6)                  0.105          0.895
Unemployment (X7)                                 0.995          0.005
Urban population (X8)                             0.496          0.504
Total                                             4.909
From the final estimation of the communalities we can see that five variables are well explained by the model, because their communalities are higher than 50% (variables X1, X2, X3, X4, X7). Three variables are not explained very well by the model (variables X5, X6, X8). Variables with a high communality share more in common with the rest of the variables. Conversely, the specific variance of each observed variable is the portion of that variable that cannot be predicted from the common factors it shares with the other variables. We therefore decided not to consider X5 and X6 when naming the factors. Since X8 has a communality very near to 50%, we do consider this variable.
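The link between loadings, communality and specific variance can be sketched in Python. For standardized variables and orthogonal factors, the communality of a variable is the sum of its squared loadings and the specific variance is one minus that sum. The example uses the unrotated loadings of two variables from the factor loading matrix; the function names are illustrative.

```python
def communality(loadings):
    """Sum of squared loadings of one variable on the retained factors."""
    return sum(value * value for value in loadings)


def specific_variance(loadings):
    """Share of a standardized variable not explained by the common factors."""
    return 1.0 - communality(loadings)


# Unrotated loadings on factors 1, 2 and 3, taken from the loading matrix.
co2_emissions = [-0.596, -0.349, -0.460]
public_spending_on_education = [0.246, 0.955, -0.148]

for name, row in [("CO2 emissions", co2_emissions),
                  ("Public spending on education", public_spending_on_education)]:
    print(f"{name}: communality = {communality(row):.3f}, "
          f"specific variance = {specific_variance(row):.3f}")
```

Both results agree with the communalities table up to rounding of the printed loadings.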
Now we can improve the interpretation of the factors by applying a rotation to the factor loading matrix.
ROTATED VARIANCE EXPLAINED BY EACH FACTOR (Total = 61.36%)

FACTOR 1    FACTOR 2    FACTOR 3
26.02%      19.9%       15.44%
ROTATED FACTOR LOADING MATRIX (varimax)

                                                 Factor 1    Factor 2    Factor 3
CO2 emissions (X1)                                -0.772      -0.301
Health expenditure per capita (X2)                 0.645       0.402
Life expectancy at birth (X3)                      0.890       0.101      -0.439
Public spending on education (X4)                  0.154       0.984
Subsidies and other transfers of expense (X5)      0.221
Terrestrial protected areas (X6)                   0.143       0.260      -0.129
Unemployment (X7)                                 -0.260                   0.962
Urban population (X8)                             -0.343      -0.537      -0.300

SS loadings                                        2.082       1.592       1.235
Proportion Var                                     0.260       0.199       0.154
Cumulative Var                                     0.260       0.459       0.614
It is clear that after the rotation the variance explained by each factor is more evenly distributed; most notably, factor 3 passes from 8.9% to 15.44%. We now want to assign a label to each factor, considering its most significant variables. In naming the latent variables we have given more weight to the original variables with communality > 50%. The first factor is mainly explained by CO2 emissions, health expenditure per capita, life expectancy at birth and unemployment. We have not considered subsidies and other transfers of expense and terrestrial protected areas because they have communality < 50%. The second factor is mainly explained by public spending on education and urban population, but only the former has a communality > 50%. The third factor is explained by unemployment. In principal components, the first factor describes most of the variability. After choosing the number of factors to retain, we want to spread the variability among the factors to improve the interpretation. We therefore consider the “rotated factors”, whose meanings are more clearly distinguished.
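Why a rotation can redistribute the variance among factors without changing the total can be illustrated with a minimal Python sketch. It does not implement varimax: it applies an arbitrary orthogonal rotation (a hypothetical angle of 0.5 radians in the plane of factors 1 and 3) to the unrotated loadings of the five variables whose three loadings are all printed. The communality of each variable stays the same while the SS loadings per factor change.

```python
import math

# Unrotated loadings (factor 1, factor 2, factor 3) of the variables
# whose three loadings are all printed in the loading matrix.
LOADINGS = {
    "CO2 emissions": [-0.596, -0.349, -0.460],
    "Health expenditure per capita": [0.532, 0.430, 0.334],
    "Public spending on education": [0.246, 0.955, -0.148],
    "Unemployment": [-0.869, 0.325, 0.365],
    "Urban population": [-0.106, -0.640, -0.274],
}


def rotate_pair(row, a, b, angle):
    """Rotate the loadings on factors a and b by `angle` radians.

    This is a plane (Givens) rotation, which is orthogonal.
    """
    c, s = math.cos(angle), math.sin(angle)
    rotated = list(row)
    rotated[a] = c * row[a] - s * row[b]
    rotated[b] = s * row[a] + c * row[b]
    return rotated


def communality(row):
    return sum(value * value for value in row)


def ss_loadings(table):
    """Sum of squared loadings per factor (column)."""
    rows = list(table.values())
    return [sum(row[j] ** 2 for row in rows) for j in range(len(rows[0]))]


ANGLE = 0.5  # hypothetical angle, chosen only for illustration
ROTATED = {name: rotate_pair(row, 0, 2, ANGLE) for name, row in LOADINGS.items()}

before, after = ss_loadings(LOADINGS), ss_loadings(ROTATED)
print("SS loadings before:", [round(v, 3) for v in before])
print("SS loadings after: ", [round(v, 3) for v in after])
print("total before/after:", round(sum(before), 6), round(sum(after), 6))
```

Varimax chooses the rotation that makes each factor's squared loadings as unequal as possible, but like any orthogonal rotation it leaves the communalities and the total explained variance unchanged.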
NEW LATENT VARIABLES                           ORIGINAL VARIABLES

FACTOR 1: WELFARE AND WELL-BEING               CO2 emissions (X1)
                                               Health expenditure per capita (X2)
                                               Life expectancy at birth (X3)
                                               Subsidies and other transfers of expense (X5)

FACTOR 2: PUBLIC INTERVENTION ON POPULATION    Public spending on education (X4)
                                               Terrestrial protected areas (X6)
                                               Urban population (X8)

FACTOR 3: UNEMPLOYMENT                         Unemployment (X7)
CLUSTER ANALYSIS

We now want to cluster the countries, using the observed values of the original variables, in order to obtain a few homogeneous groups. We compared two methods of clustering:
1. hierarchical method, using Euclidean distance and the Ward method;
2. hierarchical method, using Euclidean distance and the complete linkage method.
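The two building blocks named in the list, Euclidean distance between observations and complete linkage between clusters, can be sketched in Python as follows. The points are hypothetical two-dimensional observations chosen for illustration, not the G20 data, and the function names are assumptions.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two observations of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def complete_linkage(cluster_a, cluster_b):
    """Complete linkage: the distance between two clusters is the
    largest distance between any pair of their members."""
    return max(euclidean(p, q) for p in cluster_a for q in cluster_b)


# Hypothetical observations, not taken from the data set.
group_a = [(0.0, 0.0), (1.0, 0.0)]
group_b = [(4.0, 0.0), (6.0, 0.0)]

print("distance between first members:", euclidean(group_a[0], group_b[0]))
print("complete-linkage distance:", complete_linkage(group_a, group_b))
```

Because Euclidean distance adds squared differences across variables, a variable measured on a much larger scale than the others tends to dominate the result unless the variables are standardized first.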
This is the legend of countries:
1. Canada
2. France
3. Germany
4. Japan
5. Italy
6. Russian Federation
7. United States
8. United Kingdom
9. Brazil
10. China
11. India
12. South Africa
13. Australia
14. Saudi Arabia
15. Korea, Rep.
16. Indonesia
17. Mexico
18. Turkey
19. Spain
20. Netherlands
Using R we ran an analysis to choose the number of clusters, based on the within-cluster sum of squares. From the resulting graph we see that four clusters would be a reasonable choice.
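The quantity behind this choice can be sketched in Python. The within-cluster sum of squares adds up, over all clusters, the squared Euclidean distances of each observation from its cluster centroid; plotting it against the number of clusters and looking for the point where it stops dropping sharply is the usual way to read such a graph. The observations below are hypothetical and the function name is illustrative.

```python
def within_sum_of_squares(clusters):
    """Total squared distance of every point from its cluster centroid.

    `clusters` is a list of clusters; each cluster is a non-empty list
    of points, and each point is a tuple of numbers.
    """
    total = 0.0
    for points in clusters:
        dims = len(points[0])
        centroid = [sum(p[d] for p in points) / len(points) for d in range(dims)]
        total += sum((p[d] - centroid[d]) ** 2 for p in points for d in range(dims))
    return total


# Hypothetical observations forming two well-separated groups.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]

wss_one_cluster = within_sum_of_squares([points])
wss_two_clusters = within_sum_of_squares([points[:2], points[2:]])

print("WSS with 1 cluster: ", wss_one_cluster)
print("WSS with 2 clusters:", wss_two_clusters)
```

Splitting the points into their two natural groups removes most of the within-cluster variability, which is the kind of drop the graph is used to detect.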
In this cluster analysis we used the Ward method with the Euclidean distance. The Ward method is a hierarchical method based on the ANOVA (ANalysis Of VAriance) approach.
The graph suggests that we can use 3 clusters, because China can be treated as an isolated country that has very little in common with the other clusters.
Cluster 1: USA, India. (7-11)
Cluster 2: Brazil, Mexico, Russia, Japan, Indonesia. (9-17-6-4-16)
Cluster 3: Canada, France, Germany, Italy, United Kingdom, South Africa, Australia, Saudi Arabia, South Korea, Turkey, Spain, Netherlands. (1-12-20-13-14-19-5-15-8-18-2-3)
These are the means for each variable:

                                                            Cluster 1       Cluster 2       Cluster 3
X1 = CO2 emissions (kg per 2000 US$ of GDP)                 7.584599e-01    1.401193e+00    1.765399e+00
X2 = Urban population                                       3.652584e+07    1.153590e+08    4.082957e+08
X3 = Health expenditure per capita (current US$)            3.652584e+07    1.036108e+03    2.639784e+03
X4 = Life expectancy at birth, total (years)                7.691514e+01    7.343499e+01    7.173244e+01
X5 = Unemployment, total (% of total labor force)           7.783333e+00    5.860000e+00    4.833333e+00
X6 = Public spending on education, total (% of GDP)         4.815916e+00    4.147186e+00    3.538987e+00
X7 = Subsidies and other transfers (% of expense)           6.459847e+01    6.140823e+01    6.176835e+01
X8 = Terrestrial protected areas (% of total land area)     1.513201e+01    1.538366e+01    1.134538e+01
Cluster 1 has the highest mean for the largest number of variables. It is composed only of the USA and India. This cluster seems to have higher values for health expenditure, life expectancy, unemployment, public spending on education and subsidies. The second cluster is the one with the most terrestrial protected areas. Finally, the third cluster has the highest CO2 emissions and urban population, and it is also the cluster that contains the majority of the countries.
[Figure: Cluster Dendrogram for Solution HClust.10. Method = average; Distance = Euclidean. X axis: observation number in the data set; Y axis: height, from 0e+00 to 5e+08. Leaf order: 10, 1, 12, 20, 13, 14, 19, 5, 15, 3, 2, 8, 18, 9, 17, 6, 4, 16, 7, 11.]
This cluster analysis, with the average method and Euclidean distance, gives a worse result than the previous analysis. Now observation 10 (China) is an outlier, and observations 7 and 11 (U.S. and India) are very different from the other two clusters.
Without observations 7, 9, 10 and 11 (U.S., Brazil, China, India) we obtain a better cluster analysis, free of outliers. We now have two clusters. The first is composed of Canada, France, Germany, Italy, United Kingdom, South Africa, Australia, Saudi Arabia, South Korea, Turkey, Spain, Netherlands (1-12-20-13-14-19-5-15-8-18-2-3). The second is composed of Mexico, Russia, Japan, Indonesia (17-6-4-16).
CONCLUSION

The initial aim of this research was to find possible relationships among the countries belonging to the G20. After the cluster and factor analysis we can say that the results obtained are quite interesting, since the factor analysis suggests 3 new latent variables that summarize the original ones. We passed from 8 original variables to 3 variables. The factor analysis produced a quite satisfactory result.
We now have three groups: “welfare and well-being”, “public intervention” and “unemployment”. The cluster analysis also produced a satisfactory result. We can find some common characteristics within the clusters. We can note that cluster 2 (Brazil, Mexico, Russia, Japan, Indonesia) is characterized by countries with a large population and, apart from Japan, they are all developing countries.
Cluster 3 (Canada, France, Germany, Italy, United Kingdom, South Africa, Australia, Saudi Arabia, South Korea, Turkey, Spain, Netherlands) is the cluster that contains all the European countries, which suggests that it is the cluster with the highest welfare and equality among people. We can also note that it has the highest urban population but also the highest CO2 emissions.
Cluster 1 is more difficult to discuss because it is formed by two very different countries. The U.S. is rich and developed, whereas India has a largely poor population and is a developing country. We can nevertheless find some common points, such as public spending on education, because both India and the U.S. have a good education system.