To validate our model, we need something to compare the model’s output to
Ideally, we would have the “truth” to compare the model to, but we just have observed data points, not the true underlying risk of maternal death
Instead, we can “hold back” some of the observed data and then see how well our model, fit to the remaining data, does in predicting the held back data points
1. Hold out 20% of the data, chosen according to the type of test you want to conduct
2. Estimate the model on the remaining 80% of the data
3. Using the model from step (2), predict into the 20% hold-out sample
4. Calculate metrics of fit to determine how well the model did predicting the observed data in the 20% hold-out sample
We repeat steps 1-3 thirty times for the test of gaps in time and countries with no data, to make sure our results are not an artifact of a given random sample
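The hold-out procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-covariate least-squares fit (`fit_line`) is a hypothetical stand-in for the actual linear and spatio-temporal models, the hold-out is a simple random 20% of observations (the other tests would instead hold out whole countries or the first or last years), and relative error is assumed to mean |predicted - observed| / observed.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (hypothetical stand-in for the real model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def fit_metrics(observed, predicted):
    """Root mean/median squared error and mean/median relative error.

    Relative error is assumed here to be |predicted - observed| / |observed|,
    so observed values must be non-zero.
    """
    sq_err = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    rel_err = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return {
        "root_mean_se": math.sqrt(statistics.fmean(sq_err)),
        "root_median_se": math.sqrt(statistics.median(sq_err)),
        "mean_re": statistics.fmean(rel_err),
        "median_re": statistics.median(rel_err),
    }


def holdout_validation(data, holdout_share=0.2, repeats=30, seed=0):
    """Repeat the hold-out test and average each metric over the repeats.

    `data` is a list of (covariate, outcome) pairs.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = round(holdout_share * len(shuffled))
        test, train = shuffled[:n_test], shuffled[n_test:]   # step 1: hold out 20%
        a, b = fit_line([x for x, _ in train],
                        [y for _, y in train])               # step 2: fit on 80%
        predicted = [a + b * x for x, _ in test]             # step 3: predict hold-out
        results.append(fit_metrics([y for _, y in test], predicted))  # step 4
    return {key: statistics.fmean(r[key] for r in results) for key in results[0]}
```

Calling `holdout_validation(pairs)` on a list of `(covariate, outcome)` pairs returns the four metrics averaged over the 30 repeats; lower values indicate better out-of-sample prediction.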
Predictive validity results: comparing the linear and spatio-temporal models

20% of Countries
Regression        Root Mean SE*   Root Median SE   Mean RE**   Median RE
Linear            214.84          27.00            0.604       0.417
Spatio-Temporal   189.27          25.34            0.521       0.357

First 20% of Country Years
Regression        Root Mean SE    Root Median SE   Mean RE     Median RE
Linear            208.28          22.04            0.702       0.437
Spatio-Temporal   129.32          11.92            0.392       0.199

Last 20% of Country Years
Regression        Root Mean SE    Root Median SE   Mean RE     Median RE
Linear            158.86          13.23            0.538       0.421
Spatio-Temporal   104.08          7.46             0.284       0.213

Random 20% of Country Years
Regression        Root Mean SE    Root Median SE   Mean RE     Median RE
Linear            215.44          24.22            0.619       0.419
Spatio-Temporal   125.34          10.36            0.286       0.165

* SE = Squared Error
** RE = Relative Error
Uncertainty is the “life preserver” for any researcher!
While uncertainty intervals are sometimes ignored by policy-makers, they are crucial when interpreting results
Identifying and incorporating all relevant types of uncertainty into uncertainty intervals in an empirical way is crucial
What is the objective of uncertainty measurement?
The line in the figure is the true, underlying risk of maternal death in a sample country, or the “expected value”
But we don’t observe that expected value; we observe particular data points
We want our uncertainty bounds to contain the expected value 95% of the time
Any data source will have some degree of associated stochastic sampling error, which must be reflected in any estimates of uncertainty
We capture this uncertainty by drawing from a binomial distribution with the observed maternal cause fraction as p and the total number of observed deaths as the number of trials (n)
We simulate 100 datasets by drawing from these distributions, and use these to propagate the sampling uncertainty through the modeling process
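A minimal sketch of this sampling step, for one data point. The function name and its arguments are hypothetical, and the binomial draw is built from individual Bernoulli trials using only the standard library, which is simple but slow for very large death counts.

```python
import random


def simulate_cause_fractions(maternal_deaths, total_deaths, n_sims=100, seed=0):
    """Simulate maternal cause fractions for one data point.

    p is the observed maternal cause fraction and n is the total number of
    observed deaths, as described above. Returns n_sims simulated fractions.
    """
    rng = random.Random(seed)
    p = maternal_deaths / total_deaths
    simulated = []
    for _ in range(n_sims):
        # One binomial(n, p) draw: count successes in n independent trials.
        drawn = sum(rng.random() < p for _ in range(total_deaths))
        simulated.append(drawn / total_deaths)
    return simulated
```

For example, `simulate_cause_fractions(30, 400)` returns 100 simulated cause fractions scattered around 0.075. Data points based on fewer deaths give a wider scatter, which is how sampling uncertainty enters the later modeling stages.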
Make five draws from the variance-covariance matrix of the regression βs
Estimate the spatial-temporal model for each of these draws from the linear model
Make five draws from the variance-covariance matrix of each of the local regressions
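One common way to make such draws, sketched here as an assumption rather than as the method actually used, is to sample coefficient vectors from a multivariate normal distribution centered on the estimated βs with the estimated variance-covariance matrix. The names `beta_hat` and `vcov` are hypothetical inputs, and the Cholesky factorization is written out by hand to keep the example free of dependencies.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular L with L * L^T == matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def draw_coefficients(beta_hat, vcov, n_draws=5, seed=0):
    """Draw coefficient vectors from N(beta_hat, vcov).

    Assumes a multivariate normal for the coefficients; each draw is
    beta_hat + L z, where z is a vector of independent standard normals.
    """
    rng = random.Random(seed)
    lower = cholesky(vcov)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in beta_hat]
        draws.append([
            b + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i, b in enumerate(beta_hat)
        ])
    return draws
```

Each returned vector is one plausible set of coefficients. Re-estimating the downstream model once per draw, as in the steps above, carries the parameter uncertainty forward into the final estimates.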
Parameter uncertainty: a simple example
[Figure panels: “Here’s one potential model”; “Here’s another potential model”]
Parameter uncertainty takes into account the different models that could potentially fit the data
The fourth source we want to capture is the remaining systematic variation that our model does not explain
That is, covariates such as education and fertility, together with spatio-temporal relatedness, do not explain all variation in maternal mortality
However, we cannot estimate the systematic variation directly; the remaining variation consists of three parts
Systematic variation
Stochastic variation
Non-sampling variation
The leftover variation
[Figure labels: “Non-sampling error”; “Systematic error, but we don’t observe the true value”]
This difference could be partially stochastic error, partially non-sampling error and partially systematic error
Wanted the dependent variable in the regression model to reflect non-AIDS-related maternal deaths only
Used unpublished UNAIDS tables on the proportion of total deaths of women aged 15-49 due to AIDS
To obtain non-AIDS-related maternal deaths, assume a fraction of the AIDS deaths that occur during pregnancy that should be counted as maternal deaths; this fraction depends on the data source:
0.1 for pregnancy-related data points
0.5 for maternal data points
Use this non-AIDS-related PMDF as the dependent variable in the regression model
Because the dependent variable was non-AIDS PMDF, after estimation we must estimate the contribution of AIDS to maternal mortality and add it back in
Move from non-AIDS PMDF to total PMDF
Assume that half of the estimated number of AIDS deaths that occur during pregnancy should be counted as maternal deaths
Assume the relative risk of dying from AIDS for a pregnant versus non-pregnant woman is 0.4
IHME and the recent UN estimates

                                  IHME                        UN (H4)
Data Sources                      2651                        2142
  Vital Statistics                2186                        2010
  Surveys                         204                         819**
  Census                          46                          19
  Verbal Autopsy                  215                         113
Scope of Study
  Time series                     1980-2008                   1990-2008
  Countries                       181                         172
Correction
  Misclassification               Country specific            Correction factor 1.5 (63 countries)
  Completeness                    Country specific            UN estimates
Number of female deaths (15–49)   Rajaratnam, 2010            WHO lifetables
Estimate based on                 Model for all countries     118 model & 63 correction factor
Model                             Linear + Space-time         Multilevel
Dependent variable                MM rate (ln) by age group   Fraction of MM (log) all ages
Treatment of HIV                  Model-based                 Estimated deaths separately
Covariates
  GDP                             yes                         yes
  Education                       yes                         no
  TFR                             yes                         yes
  HIV                             yes                         no
  Health services                 Neonatal mort               SBA
Model Validation                  yes                         no
Uncertainty                       yes                         yes