Life Expectancy
Statistical Analysis
Group 3 – Nooshin & Helen
Which factors should a country prioritize in order to improve
the life expectancy of its population?
Contents
• Data Collection
• Analysis
• Findings
• Life expectancy dataset from Kaggle website
https://www.kaggle.com/kumarajarshi/life-expectancy-who/version/1
• 193 countries, 16 years (2000-2015)
• 2,938 records, of which 1,289 with missing values are dropped, leaving 1,649 observations
Random sampling of the 1,649 observations
• Training data: 75%
• Test data: 25%
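The 75/25 random split can be sketched as follows. This is a minimal Python illustration, not the deck's own code: the function name `train_test_split_indices`, the seed, and the choice to round the training size down are all assumptions.

```python
import random

def train_test_split_indices(n_rows, train_fraction=0.75, seed=0):
    """Randomly split row indices 0..n_rows-1 into training and test sets."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)    # seeded so the split is reproducible
    n_train = int(n_rows * train_fraction)  # rounds down (an assumption)
    return indices[:n_train], indices[n_train:]

# 1,649 complete observations, as stated on the slide.
train_idx, test_idx = train_test_split_indices(1649)
print(len(train_idx), len(test_idx))  # 1236 413
```

Every row lands in exactly one of the two sets, so the test data stays unseen while the models are fitted.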
Response variable: Life Expectancy
• Fairly normal
• Mean: 69.25
• Min: 44
• Max: 89
20 predictors: health, economic, immunization, mortality and social factors
Health factors
• Alcohol consumed per capita (in liters)
• Average BMI
• Prevalence of thinness for ages 10-19
• Prevalence of thinness for ages 5-9
Immunization factors
• % of immunization coverage among 1-year-olds (three predictors)
• # cases per 1000
Economic factors
• % expenditure on health of GDP per capita
• % government expenditure on health
• GDP per capita
• Income composition of resources (index from 0 to 1)
Mortality factors
• Probability of dying between 15 and 60 years per 1000 population
• # infant deaths per 1000
• # under-five deaths per 1000
• HIV/AIDS deaths per 1000 live births
Social and other factors
• # of years of schooling
• Developed or developing country
• 10 of the 20 predictors are not statistically significant
• 11 variables do not show a random scatter pattern in the plots of standardized residuals against each predictor
Best Subset
Size | Predictors | Adjusted R-squared | AIC | AICc | BIC
1 | x20 | 0.5364 | 4451.051 | 4451.060 | 4461.290
2 | x14, x20 | 0.7263 | 3800.621 | 3800.641 | 3815.980
3 | x3, x14, x20 | 0.7824 | 3518.318 | 3518.351 | 3538.797
4 | x3, x14, x19, x20 | 0.8049 | 3384.114 | 3384.163 | 3409.712
5 | x3, x9, x14, x19, x20 | 0.8110 | 3345.722 | 3345.791 | 3376.440
6 | x3, x4, x10, x14, x19, x20 | 0.8172 | 3305.408 | 3305.499 | 3341.245
7 | x3, x4, x6, x10, x14, x19, x20 | 0.8231 | 3265.85 | 3265.968 | 3306.807
8 | x3, x4, x6, x9, x10, x14, x19, x20 | 0.8281 | 3231.72 | 3231.867 | 3277.797
• Adult Mortality
• Infant Deaths
• Percentage Expenditure
• BMI
• Under five Deaths
• HIV/AIDS
• Income
• Schooling
• Check multicollinearity using VIF: remove x4 (Infant Deaths)
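To illustrate why two closely related predictors inflate the VIF, here is a minimal Python sketch for the two-predictor case, where each predictor's VIF equals 1 / (1 - r^2) and r is their Pearson correlation. The sample values are made up for illustration and are not taken from the dataset; with more than two predictors the VIF comes from regressing each predictor on all the others, which this sketch does not do.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def pairwise_vif(xs, ys):
    """VIF of either predictor in a model containing only these two."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical values: infant deaths and under-five deaths move together.
infant_deaths = [0, 2, 5, 20, 60, 110]
under_five_deaths = [0, 3, 7, 28, 85, 150]

print(pearson_r(infant_deaths, under_five_deaths))
print(pairwise_vif(infant_deaths, under_five_deaths))
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic multicollinearity; dropping one predictor of the pair is a typical remedy.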
• Adult Mortality
• Percentage Expenditure
• BMI
• Under five Deaths
• HIV/AIDS
• Income
• Schooling
Life expectancy = 53.58 - 0.019 (Adult Mortality) + 0.0004 (Percentage Expenditure) + 0.039 (BMI) - 0.003 (Under five Deaths) - 0.422 (HIV/AIDS) + 10.95 (Income composition of resources) + 0.924 (Schooling)
• Adult mortality, under-five deaths, and HIV/AIDS have a negative effect on life expectancy
• Percentage expenditure, BMI, income composition, and schooling have a positive effect
• Income composition has the largest positive coefficient
• HIV/AIDS has the largest negative coefficient
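As a rough illustration of how the fitted equation above is applied, the Python sketch below plugs in one set of predictor values. The coefficients are the rounded ones from the slide; the dictionary keys and the example country are hypothetical. Because income composition is an index between 0 and 1, the sketch also shows the effect of a 0.1 increase rather than a full unit.

```python
# Rounded coefficients from the fitted equation on the slide.
INTERCEPT = 53.58
COEFFICIENTS = {
    "adult_mortality": -0.019,
    "percentage_expenditure": 0.0004,
    "bmi": 0.039,
    "under_five_deaths": -0.003,
    "hiv_aids": -0.422,
    "income_composition": 10.95,
    "schooling": 0.924,
}

def predict_life_expectancy(country):
    """Apply the linear equation to a dict of predictor values."""
    return INTERCEPT + sum(
        coef * country[name] for name, coef in COEFFICIENTS.items()
    )

# Hypothetical country, for illustration only.
example = {
    "adult_mortality": 150,
    "percentage_expenditure": 500,
    "bmi": 40,
    "under_five_deaths": 10,
    "hiv_aids": 0.1,
    "income_composition": 0.7,
    "schooling": 12,
}

baseline = predict_life_expectancy(example)
improved = predict_life_expectancy({**example, "income_composition": 0.8})
print(round(baseline, 2))             # 71.17
print(round(improved - baseline, 3))  # 1.095
```

With the other predictors held fixed, raising the income composition index by 0.1 adds about 1.1 years in this model, which is one tenth of the 10.95 coefficient.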
• Remove x4 due to high correlation with x10
• Best subset
Size | Predictors | Adjusted R-squared | AIC | AICc | BIC
1 | x20 | 0.5364 | 4451.051 | 4451.07 | 4461.29
2 | x14, x20 | 0.7263 | 3800.621 | 3800.654 | 3815.98
3 | x3, x14, x20 | 0.7824 | 3518.318 | 3518.367 | 3538.797
4 | x3, x14, x19, x20 | 0.8049 | 3384.114 | 3384.182 | 3409.712
5 | x3, x9, x14, x19, x20 | 0.811 | 3345.722 | 3345.813 | 3376.44
6 | x1, x3, x9, x14, x19, x20 | 0.8159 | 3314.762 | 3314.879 | 3350.599
7 | x1, x3, x9, x14, x15, x19, x20 | 0.8185 | 3297.588 | 3297.735 | 3338.545
8 | x1, x3, x5, x6, x9, x14, x19, x20 | 0.8208 | 3282.996 | 3283.176 | 3329.073
• Year
• Adult Mortality
• Alcohol
• Percentage Expenditure
• BMI
• HIV/AIDS
• Income
• Schooling
Life expectancy = 34.45 - 0.146 (Year) - 0.018 (Adult Mortality) - 0.15 (Alcohol) + 0.00004 (Percentage Expenditure) + 0.041 (BMI) - 0.428 (HIV/AIDS) + 11.79 (Income composition of resources) + 1.059 (Schooling)
• Negative effect: Year, Adult Mortality, Alcohol, and HIV/AIDS
• Positive effect: Percentage Expenditure, BMI, Income composition, and Schooling
• Life expectancy increases by 11.79 years when Income composition increases by 1 unit (other predictors held fixed)
• Life expectancy decreases by 0.4276 years when HIV/AIDS increases by 1 unit (other predictors held fixed)
• Convert x4, x6, x8, x10 to categorical variables (share of observations per category)
x4: 0 (24%), 1-2 (22%), 3-30 (35%), 30-1600 (20%)
x6: 0 (0.3%), 0-100 (45%), 100-1000 (41%), 1000+ (13%)
x8: 0 (34%), 1-10 (14%), 10-140K (52%)
x10: 0 (21%), 1-2 (21%), 3-30 (33%), 30-2100 (25%)
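Converting a count predictor into categories can be sketched as below, using the three-level grouping (0, 1-10, above 10). This is a minimal Python illustration: the function name is hypothetical, the sample counts are made up, and assigning the boundary value 10 to the "1-10" group is an assumption, since the ranges on the slide overlap at that value.

```python
from collections import Counter

def categorize_cases(cases):
    """Map a non-negative case count to one of three categories."""
    if cases < 0:
        raise ValueError("case count cannot be negative")
    if cases == 0:
        return "0"
    if cases <= 10:   # assumption: the boundary value 10 belongs to "1-10"
        return "1-10"
    return "10+"

# Made-up counts, for illustration only.
sample_counts = [0, 0, 3, 10, 11, 250, 14000]
labels = [categorize_cases(c) for c in sample_counts]
shares = {k: v / len(labels) for k, v in Counter(labels).items()}
print(labels)
print(shares)
```

The resulting labels can then enter a regression as a factor, which yields one coefficient per non-reference category.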
• Box-Cox transformation
X3 = sqrt(x3)
X5 = sqrt(x5)
X7 = x7^3
X11 = x11^4
X13 = x13^4
X14 = 1/sqrt(x14)
X15 = log(x15)
X16 = log(x16)
X17 = log(x17)
X18 = log(x18)
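The transformations listed above are simple power forms of the kind the Box-Cox family suggests: an exponent of 0.5 corresponds to a square root, 0 to a logarithm, and -0.5 to an inverse square root. The Python sketch below shows the standard one-parameter Box-Cox formula alongside a few of the simplified forms; the function and dictionary names are hypothetical, and how the exponents were estimated is not shown in the deck.

```python
import math

def box_cox(x, lam):
    """Standard Box-Cox transform of a strictly positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# Simplified power forms matching some of the entries in the list above.
POWER_TRANSFORMS = {
    "x3": math.sqrt,                      # lambda = 0.5
    "x13": lambda v: v ** 4,              # lambda = 4
    "x14": lambda v: 1.0 / math.sqrt(v),  # lambda = -0.5
    "x15": math.log,                      # lambda = 0
}

print(box_cox(4.0, 0.5))             # 2.0
print(POWER_TRANSFORMS["x3"](4.0))   # 2.0
print(POWER_TRANSFORMS["x14"](4.0))  # 0.5
```

The simplified forms differ from the full formula only by a shift and a scale, so they lead to the same fitted values in a linear model. For a negative exponent the simplified form also reverses the ordering of the values, which flips the sign of the coefficient.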
• Best Subset
Subset size | Predictors | Adjusted R-squared | AIC | Corrected AIC | BIC
1 | x20 | 0.5364 | 4451.051 | 4451.060 | 4461.290
2 | x14, x20 | 0.7263 | 3800.621 | 3800.641 | 3815.980
3 | x3, x14, x20 | 0.7824 | 3518.318 | 3518.351 | 3538.797
4 | x3, x14, x19, x20 | 0.8049 | 3384.114 | 3384.163 | 3409.712
5 | x3, x9, x14, x19, x20 | 0.8110 | 3345.722 | 3345.791 | 3376.440
6 | x3, x4, x10, x14, x19, x20 | 0.8172 | 3305.408 | 3305.499 | 3341.245
7 | x3, x4, x6, x10, x14, x19, x20 | 0.8231 | 3265.85 | 3265.968 | 3306.807
8 | X3, factor(x8), X13, X14, X15, X18, x19, x20 | 0.8281 | 3231.72 | 3231.867 | 3277.797
• Adult Mortality: sqrt(x3)
• Measles 10+
• Measles 1-10
• Diphtheria: x13^4
• HIV/AIDS: 1/sqrt(x14)
• GDP: log(x15)
• Thinness 5-9 years: log(x18)
• Income
• Schooling
• Remove x4 due to high correlation with x10
• Box-Cox transformation
X3 = sqrt(x3)
X5 = sqrt(x5)
X7 = x7^3
X11 = x11^4
X13 = x13^4
X14 = 1/sqrt(x14)
X15 = log(x15)
X16 = log(x16)
X17 = log(x17)
X18 = log(x18)
X6 = x6 ^.01
X8 = x8 ^.01
• Best Subset
Subset size | Predictors | Adjusted R-squared | AIC | Corrected AIC | BIC
1 | X14 | 0.56527 | 4371.459 | 4371.479 | 4381.699
2 | X14, X19 | 0.7289 | 3788.639 | 3828.639 | 3803.998
3 | X3, X14, X19 | 0.76394 | 3618.698 | 3618.746 | 3639.176
4 | X3, X14, X19, X20 | 0.7810 | 3527.2 | 3527.269 | 3552.799
5 | X3, X10, X14, X19, X20 | 0.7892 | 3480.741 | 3480.832 | 3511.459
6 | X3, X10, X14, X18, X19, X20 | 0.79383 | 3454.336 | 3454.453 | 3490.174
7 | X3, X6, X10, X14, X18, X19, X20 | 0.7977 | 3431.751 | 3431.898 | 3472.709
8 | X3, X6, X10, X13, X14, X18, X19, X20 | 0.7999 | 3419.345 | 3419.525 | 3465.422
• Adult Mortality: sqrt(x3)
• Percentage Expenditure: x6^0.01
• Under five deaths: x10^0.01
• Diphtheria: x13^4
• HIV/AIDS: 1/sqrt(x14)
• Thinness 5-9 years: log(x18)
• Income composition
• Schooling
Model | Predictors | Adjusted R-squared | AIC | Corrected AIC | BIC | MAPE
1 | x3, x6, x9, x10, x14, x19, x20 | 0.8178 | 3302.552 | 3302.67 | 3343.509 | 4.05%
2 | x1, x3, x5, x6, x9, x14, x19, x20 | 0.8208 | 3282.996 | 3283.143 | 3329.073 | 4%
3 | X3, factor(x8n), X13, X14, X15, X18, x19, x20 | 0.7974 | 3286.474 | 3286.621 | 3332.551 | 4.46%
4 | X3, X6, X10, X13, X14, X18, x19, x20 | 0.7999 | 3419.345 | 3419.525 | 3465.422 | 4.4%
*MAPE = $\left( \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{y_i} \right) \times 100\%$
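The MAPE definition can be sketched in Python as follows. The function name `mape` and the sample numbers are illustrative only; in the comparison above, the actual values would be the observed life expectancies and the predictions would come from each fitted model.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(y - y_hat) / abs(y) for y, y_hat in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Made-up observed and predicted life expectancies.
observed = [50.0, 80.0]
predicted = [45.0, 88.0]
print(mape(observed, predicted))  # each prediction is off by 10%, so about 10
```

Because each error is divided by the observed value, MAPE is undefined when an observed value is zero. That is not a concern here, since the minimum life expectancy in the data is 44.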
Life expectancy = 34.45 - 0.146 (Year) - 0.018 (Adult Mortality) - 0.15 (Alcohol) + 0.00004 (Percentage Expenditure) + 0.041 (BMI) - 0.428 (HIV/AIDS) + 11.79 (Income composition of resources) + 1.059 (Schooling)
• Economic, social, and health factors have the largest effect on life expectancy.
• A country with low life expectancy should increase its health expenditure to improve the life span of its population.
• We cannot say that densely populated countries tend to have lower life expectancy, since population is not a statistically significant predictor.
• Sheather, S. J. (2009). A Modern Approach to Regression with R. New York, NY: Springer Science + Business Media, LLC.
• Rajarshi, K. (2018). Life Expectancy (WHO): Statistical Analysis on factors influencing Life Expectancy. Retrieved from https://www.kaggle.com/kumarajarshi/life-expectancy-who
Thank you!

Identify factors affecting life expectancy using R
