2. AGENDA
• Data Collection and Data Introduction
• Data Pre-processing
• First Hypothesis
• First Model Analysis
• Second Hypothesis
• Second Model Analysis
• Conclusion
3. Data Collection and Data Introduction
Number of Observations: 199 (based on country)
Number of Candidate Variables: 12
• GDP
• Unemployment Rate
• Population Density
• Immigration Rate
• Life Expectancy
• Hospital Bed Density
• Physicians Density
• Health Expenditure
• Obesity Rate
• Literacy
• Death Rate
• Recovery Rate
4. Data Pre-processing
How do we deal with multiple data sources?
We use the country name as the key to merge the multiple CSV files.
How do we deal with missing values?
We fill the missing values in a given column with that column's median, which is less sensitive to outliers than the mean.
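The two pre-processing steps above can be sketched as follows. This is a minimal illustration, not the code used for the project: the CSV contents, the column names, the helper names, and the choice of an outer join are all assumptions, and only the Python standard library is used so the sketch stays self-contained.

```python
import csv
import io
from statistics import median

# Hypothetical stand-ins for two of the source CSV files.
GDP_CSV = "country,gdp\nA,1000\nB,\nC,3000\n"
BEDS_CSV = "country,hospital_bed_density\nA,2.5\nB,4.0\nD,1.0\n"


def read_by_country(text, key="country"):
    """Return {country: {column: value_or_None}} for one CSV source."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row.pop(key).strip()
        rows[name] = {col: float(val) if val.strip() else None
                      for col, val in row.items()}
    return rows


def merge_sources(*sources):
    """Join several sources on the country name (outer join, an assumption)."""
    columns = sorted({col for src in sources for rec in src.values() for col in rec})
    countries = sorted({name for src in sources for name in src})
    merged = {name: dict.fromkeys(columns) for name in countries}
    for src in sources:
        for name, rec in src.items():
            merged[name].update(rec)
    return merged


def fill_with_median(table):
    """Replace each missing value with the median of the observed values in its column."""
    columns = list(next(iter(table.values())).keys())
    for col in columns:
        observed = [rec[col] for rec in table.values() if rec[col] is not None]
        col_median = median(observed)
        for rec in table.values():
            if rec[col] is None:
                rec[col] = col_median
    return table


table = fill_with_median(merge_sources(read_by_country(GDP_CSV),
                                       read_by_country(BEDS_CSV)))
```

Countries missing from one file end up with an empty cell after the merge, and that cell is then filled with the column median like any other missing value.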
5. First Hypothesis
H0: Medical resources do not have a significant impact on the recovery rate of coronavirus patients.
H1: Medical resources have a significant impact on the recovery rate of coronavirus patients.
6. Summary of Data
X1: Physicians Density
X2: Hospital Bed Density
Y: COVID-19 Recovery Rate
8. Multiple Regression Model Condition Check
Condition 1: The unobserved errors 𝜀 are independent of one another. Checked by plotting residuals vs. predicted values.
Condition 2: The unobserved errors 𝜀 have equal variance. Checked by plotting residuals vs. Xi.
Condition 3: The unobserved errors 𝜀 are normally distributed around the regression equation. Checked with a Q-Q plot, which compares the quantiles of the residuals with the quantiles of a normal distribution.
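The quantities behind these three plots can be computed as in the sketch below. The observed and fitted values are hypothetical, the function names are illustrative, and the plotting positions (i - 0.5) / n are one common convention rather than necessarily the one used for the slides.

```python
from statistics import NormalDist, mean, pstdev


def residuals(observed, fitted):
    """Residual e_i = y_i - y_hat_i for each observation."""
    return [y - y_hat for y, y_hat in zip(observed, fitted)]


def qq_points(resid):
    """Pair each sorted, standardized residual with a standard normal quantile."""
    n = len(resid)
    centre, spread = mean(resid), pstdev(resid)
    standardized = sorted((e - centre) / spread for e in resid)
    # Plotting positions (i - 0.5) / n avoid the undefined 0 and 1 quantiles.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, standardized))


# Hypothetical observed and fitted recovery rates, for illustration only.
y = [0.62, 0.48, 0.71, 0.55, 0.40, 0.66]
y_hat = [0.60, 0.50, 0.65, 0.58, 0.45, 0.64]

e = residuals(y, y_hat)
residuals_vs_fitted = list(zip(y_hat, e))  # scatter for Condition 1
qq = qq_points(e)                          # scatter for Condition 3
```

For Condition 2, the residuals `e` are paired with the values of each explanatory variable Xi in the same way. The resulting pairs can be passed to any plotting library; points close to a straight line in the Q-Q scatter are consistent with normally distributed errors.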
9. Collinearity Test
● The Variance Inflation Factor (VIF) measures collinearity
● A VIF of less than 5 indicates that our variables are not redundant
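For a model with exactly two explanatory variables, as in the first model, the VIF reduces to 1 / (1 - r^2), where r is the Pearson correlation between the two variables. A minimal sketch, using hypothetical values and illustrative function names:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both variables when the model has only two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical densities, for illustration only.
physicians = [1.0, 2.0, 3.0, 4.0, 5.0]
beds = [2.0, 1.0, 4.0, 3.0, 5.0]

vif = vif_two_predictors(physicians, beds)
is_redundant = vif >= 5  # threshold taken from the slide
```

With more than two explanatory variables, r^2 is replaced by the R-squared from regressing each variable on all the others.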
10. Model Explanation
● RMSE: 0.2263
● R-squared: 0.0007115
● F-statistic: 0.06978
● P-value: 0.9326 > 0.05
Cannot reject H0: there is no significant evidence that medical resources have an impact on the recovery rate of coronavirus patients.
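The reported F-statistic can be cross-checked from the R-squared value. The sketch below assumes the model was fit on all 199 observations with the two explanatory variables from the data summary; the function name is illustrative.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F-statistic of a regression, computed from its R-squared."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    return (r_squared / df_model) / ((1.0 - r_squared) / df_resid)


# Assumed: 199 observations, 2 explanatory variables (X1 and X2).
f_first = f_statistic(0.0007115, n_obs=199, n_predictors=2)
```

Under these assumptions the result is about 0.0698, in line with the F-statistic on the slide.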
11. Does the Hypothesis Work?
F-statistic: 0.06978
P-value: 0.9326
Alpha: 0.05
H0: Medical resources do not have a significant impact on the recovery rate of coronavirus patients.
H1: Medical resources have a significant impact on the recovery rate of coronavirus patients.
Let's look back at the candidate variables:
• GDP
• Unemployment Rate
• Population Density
• Immigration Rate
• Life Expectancy
• Hospital Bed Density
• Physicians Density
• Health Expenditure
• Obesity Rate
• Literacy
• Death Rate
• Recovery Rate
GDP and Life Expectancy: Country's Development Level
12. Second Hypothesis
H0: The level of a country's development has no association with the COVID-19 recovery rate.
H1: The level of a country's development has an association with the COVID-19 recovery rate.
13. Summary of Data
X1: Physicians density
X2: Hospital bed density
X3: GDP
X4: Life expectancy
Y: COVID-19 recovery rate
14. Exploratory Data Analysis
● Recovery rate vs. GDP: positive linear relationship
● Physicians density vs. hospital bed density: positive linear relationship
● Physicians density vs. GDP: positive linear relationship
● Hospital bed density vs. GDP: positive linear relationship
15. Multiple Regression Model Condition Check
Condition 1: The unobserved errors 𝜀 are independent of one another. Checked by plotting residuals vs. predicted values.
Condition 2: The unobserved errors 𝜀 have equal variance. Checked by plotting residuals vs. Xi.
Condition 3: The unobserved errors 𝜀 are normally distributed around the regression equation. Checked with a Q-Q plot, which compares the quantiles of the residuals with the quantiles of a normal distribution.
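16. Compare the Marginal Coefficient vs. Partial Coefficient of Physicians Density
Marginal coefficient: -0.004209
Partial regression coefficient: -0.03711
In the multiple regression model (MRM), physicians density has a positive linear relationship with the other explanatory variables: hospital bed density, GDP, and life expectancy. This correlation explains why the partial coefficient differs from the marginal one.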
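The gap between a marginal and a partial coefficient can be reproduced on a small synthetic example. The sketch below uses only two explanatory variables and made-up data, so it illustrates the mechanism rather than the project's four-variable model; all names are illustrative.

```python
def centered(values):
    """Subtract the mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def marginal_slope(x, y):
    """Slope of the simple regression of y on x alone."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) / dot(xc, xc)


def partial_slope(x1, x2, y):
    """Coefficient of x1 in the least-squares regression of y on x1 and x2."""
    a, b, yc = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    return (dot(a, yc) * s22 - dot(b, yc) * s12) / (s11 * s22 - s12 ** 2)


# Synthetic data: x1 and x2 are positively correlated, and y = -x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
y = [-a + b for a, b in zip(x1, x2)]

marginal = marginal_slope(x1, y)    # absorbs part of the positive effect of x2
partial = partial_slope(x1, x2, y)  # effect of x1 with x2 held fixed
```

Here the marginal slope is closer to zero than the partial one because x1 rises together with x2, whose effect on y is positive. This is the same direction of difference as on the slide.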
17. Model Explanation
● RMSE: 0.2215
● R-squared: 0.05288
● F-statistic: 2.7088
● P-value: 0.03153 < 0.05
Reject H0: the level of a country's development has an association with the COVID-19 recovery rate.
18. Conclusion
First Hypothesis
There is no significant evidence to reject H0: Medical resources do not have a significant impact on the recovery rate of coronavirus patients.
Second Hypothesis
There is significant evidence that the level of a country's development has an association with the COVID-19 recovery rate (reject H0).
Formula
Recovery Rate = -0.2407 - 0.03711 * Physicians Density + 0.0004217 * Hospital Bed Density + (9.47 * 10^(-7)) * GDP + 0.00722 * Life Expectancy
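The fitted equation can be wrapped in a small function for making predictions. The coefficients are the ones on the slide; the function name and the example inputs are hypothetical, and the inputs must be in the same units as the original dataset, which the slides do not state.

```python
def predicted_recovery_rate(physicians_density, hospital_bed_density,
                            gdp, life_expectancy):
    """Evaluate the fitted second model for one country."""
    return (-0.2407
            - 0.03711 * physicians_density
            + 0.0004217 * hospital_bed_density
            + 9.47e-7 * gdp
            + 0.00722 * life_expectancy)


# Hypothetical inputs, for illustration only.
example = predicted_recovery_rate(
    physicians_density=2.0,
    hospital_bed_density=3.0,
    gdp=20_000,
    life_expectancy=75.0,
)
```

Given the low R-squared of the model (0.05288), such predictions are best read as rough indications rather than precise estimates.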