Digital Demography
Bogdan State & Ingmar Weber
@bogdanstate @ingmarweber
https://sites.google.com/site/digitaldemography/
Part I: Traditional Demographic Methods
Population Equation
• Change in population = Inputs – Outputs
• Inputs = Births + In-migration
• Outputs = Deaths + Out-Migration
• ∆P = (B + I) − (D + O)
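A minimal sketch of the balancing equation above as a worked calculation. The function name and all counts are hypothetical, chosen only to illustrate the arithmetic.

```python
def population_change(births, in_migration, deaths, out_migration):
    """Return the change in population: (B + I) - (D + O)."""
    return (births + in_migration) - (deaths + out_migration)

# Hypothetical one-year counts for an illustrative region (not real data).
start_population = 1_000_000
delta_p = population_change(births=12_000, in_migration=3_500,
                            deaths=9_000, out_migration=2_000)
end_population = start_population + delta_p
print(delta_p, end_population)
```

With these made-up inputs, inputs total 15,500 and outputs total 11,000, so the population grows by 4,500.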
Population Equation
• ∆P = (B + I) − (D + O)
• Fundamental questions of demographic research:
• Fertility
• Mortality
• Migration
Fertility
• “How many newborns?”
• Deceptively simple question
• Factors behind fertility:
• Biological
• E.g. decline in both male (Andersson et al., 2008) and female (Crain et al., 2008)
reproductive health
• Social
• E.g. desired family size
Mortality
• “How many deaths?”
• Epidemiological / public health questions:
• Infectious diseases
• Lifestyle diseases
• War / natural disasters
• “Global Burden of Disease” (Murray and Lopez, 1997): “Medically certified information
is available for less than 30% of the estimated 50.5 million deaths that occur each year
worldwide.”
Mortality Rates
(Marmot, 2005)
Social Determinants of Mortality
• Inequality deeply connected with mortality, especially mortality from preventable
causes.
• “although it might be obvious that poverty is at the root of much of the problem of
infectious disease, and needs to be solved, it is less obvious how to break the link
between poverty and disease. Income poverty provides, at best, an incomplete
explanation of differences in mortality among countries.” (Marmot, 2005)
Social Determinants of Mortality
(Marmot, 2005)
Social Determinants of Mortality
(Marmot, 2005)
Life Tables
• For a closed population, fertility and mortality = entry and exit
• How to transform event data into state data?
• “The ordinary life table is a device for following a closed group of people, born at the same
time, as it decreases in size until the death of its last member. The emphasis is put on the
nonreversible transition from one state (being alive) to another (being dead).” (Ledent, 1980)
• Events are coarsened (e.g. to year-level), but desired estimates must be continuous
• What to do when data available at different granularities?
• Can also be used for repeated events, and with open populations.
• Endless methodological variation.
• Not assumption free!
Life Tables
• lx – number of survivors at exact age x
• qx – probability of dying between ages x and x+1
• Lx – person-years lived between ages x and x+1
• ex – life expectancy (expected remaining years of life) at age x
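A minimal sketch of how the life table columns can be derived from a schedule of death probabilities. It assumes single-year age intervals, deaths falling on average halfway through each interval, and a final qx of 1.0 that closes the table; the function name and the toy schedule are hypothetical.

```python
def life_table(qx, radix=100_000):
    """Build (lx, Lx, ex) from single-year death probabilities qx.

    Assumes deaths occur, on average, halfway through each age interval
    and that the last qx is 1.0, so the cohort is fully extinguished.
    """
    lx, Lx = [], []
    alive = float(radix)
    for q in qx:
        deaths = alive * q
        lx.append(alive)                  # survivors at exact age x
        Lx.append(alive - 0.5 * deaths)   # person-years lived in [x, x+1)
        alive -= deaths
    # e_x = (person-years lived at age x and above) / l_x
    ex = [0.0] * len(qx)
    remaining = 0.0
    for x in range(len(qx) - 1, -1, -1):
        remaining += Lx[x]
        ex[x] = remaining / lx[x] if lx[x] > 0 else 0.0
    return lx, Lx, ex

# Hypothetical toy schedule with four ages (not real data).
lx, Lx, ex = life_table([0.1, 0.2, 0.5, 1.0])
print(lx, Lx, ex)
```

Real life tables use more careful assumptions for the first year of life and for the open-ended last age group, which is one reason the method is not assumption free.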
Hazard Curves
• Representing the age-specific death rate:
• “Bathtub curve” over the life course:
- infant mortality
- accelerating mortality at end of lifespan
• Different countries have different hazard curves!
• (Vanderbloemen, Dorling and Minton, 2016)
Changes in Infant Mortality
(Lopez et al., 2002)
Adult Mortality
(Weon and Je, 2012)
Demographic Transition
• Typical Western country, great-grandparents’ generation, large families.
• Shift over time from a high-infant-mortality, high-birth-rate regime to a low-infant-mortality,
low-birth-rate regime. (Notestein, 1945)
• “At present, there are barely a dozen countries that have not begun a fertility
decline.” (Lesthaeghe, 2010)
• Two regimes: high- and low- fertility:
- key difference is economic returns on fertility restriction (Caldwell, 1976).
- Social and economic transformations happen at once (Caldwell, 1976)
• “U curve” to “J curve”
Demographic Transition: Factors
• Advances in medicine (infectious disease)
• Food security
• Contraception
• State policies (e.g. one-child policy)
• Changing attitudes
Demographic Transition: Consequences
• Wide population pyramid = glut of young people
• Resource pressure
• Future working-age population
• Shift in care-giving patterns, and family structures
• Population stabilization: how fast it happens is key to forecasts
• Fundamental importance to world economy / history!
After the Demographic Transition
• Large generations moving through the population pyramid
• Pyramid shifting towards narrower, blockier shape
• Possible reverberations, “echo booms” (Easterlin, 1973, cyclical fertility)
• J-curve also widens: later age of death
• Large populations of older individuals, “greying” of societies (e.g. Italy, Japan)
After the Demographic Transition
(Marmot, 2005)
Second Demographic Transition
• (From Lesthaeghe, 2010)
• Western countries, below replacement fertility
• Multitude of living arrangements other than marriage
• Disconnection between marriage and procreation
• No stationary populations
• Need immigration to maintain stationarity
• Divorce and remarriage, sexual revolution
• Open to debate whether one transition or two
(Lesthaeghe, 2010)
Migration
• Complicates life tables: entry events at every age
• Migrants may have different age-specific death rates
• Unit of analysis: international vs. internal migration
Who is a migrant?
• Legal vs. administrative realities:
• U.S.: Non-resident alien with fiscal residency
• Regular and irregular internal movement (e.g. China, former Soviet Union)
• Repeat and seasonal migration
• Proper units of analysis: Boundary-spanning urban clusters
• Time:
• Distinguishing migration from tourism
• Frequent international travel
Special migrant categories
• Agricultural workers (seasonal)
• Highly-skilled migrants:
• Health-care professionals
• Knowledge workers
• Undocumented migrants
• Asylum seekers
• Environmentally-displaced people
• Human trafficking
• Retirees
• Super-commuters
• What do these all have in common?
Measuring migrations
• Hard-to-measure populations
• High legal stakes – measurement can have consequences, e.g. undocumented migration, human trafficking
• Fuzzy definitions: highly-skilled, super-commuters
• Hard-to-reach: universal
• No uniform definition:
• Minimum 90 days stay vs. 183 days in a year?
• Contiguous vs. non-contiguous stays
• When is a move a move?
• Forecasting is highly complex
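A minimal sketch of why the choice of definition matters. The two thresholds come from the bullets above; the function names, the rule details, and the travel history are hypothetical and do not correspond to any official definition.

```python
def is_migrant_contiguous(stays, min_days=90):
    """True if any single stay lasts at least `min_days` consecutive days."""
    return any(end - start + 1 >= min_days for start, end in stays)

def is_migrant_cumulative(stays, min_days=183):
    """True if days spent in the destination sum to at least `min_days`."""
    return sum(end - start + 1 for start, end in stays) >= min_days

# Hypothetical person: (first_day, last_day) of each stay, as day of year.
stays = [(1, 60), (100, 160), (200, 265)]

print(is_migrant_contiguous(stays))   # no single stay reaches 90 days
print(is_migrant_cumulative(stays))   # the three stays total 187 days
```

The same person is a non-migrant under the contiguous 90-day rule and a migrant under the cumulative 183-day rule, so counts built on different definitions are not directly comparable.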
Data Sources for Demographic Research
• Population censuses
• Population register
• Small-sample surveys
Population Censuses
• Interview-based approach, door-to-door
• Snapshot state of population – “dimension table”
• 190 countries conducting censuses during 2000 round (UNSTATS)
• Political implications:
- Redistricting
- Power-sharing agreements
- Vulnerable social groups
Population Censuses: Data Reliability
• Politics => potential data reliability issues:
• Missing censuses (e.g. Lebanon – see Soffer, 1986)
• Missing questions (e.g. sexual orientation, religion, etc.)
- e.g. the England and Wales census first asked about religion in 2001 (see Southworth, 2005)
- in the U.S. census, “Prior to 1990, couples living outside of marriage in marriage-like
relationships were not identified separately from individuals living together as roommates.”
(Black et al., 1999)
• Missing people (migrants, LGBT, religious and ethnic minorities)
- e.g. Fenton et al., 2001; Goldstein and Morning, 2000
Population Registers
• Theory:
- national identification scheme, unique personal ID
- use new events to update individuals’ state
• Practice:
- Missing data!
- Administrative registration is not the same as the real event
- But, it’s possible! E.g., “Subjects were randomly selected from 10 year age and sex
strata in the entire Swedish population using the continuously updated computerised
population register, and were frequency matched so that their age and sex
distributions resembled those of the cancer cases” (Lagergren, Bergström, Nyrén,
2000)
Small-sample surveys
- Canonical example: American Community Survey
- Many subject-specific surveys that can be used to derive demographic
quantities.
- Ranging from pure public health to pure sociology
- E.g. National Health and Nutrition Examination Survey, National Epidemiologic Survey
on Alcohol and Related Conditions (U.S.)
- Compare: Add Health, National Time-Use Survey, General Social Survey (U.S.)
- Of particular relevance: DHS studies
DHS Surveys
• “The Demographic and Health Surveys (DHS) Program has collected, analyzed, and disseminated
accurate and representative data on population, health, HIV, and nutrition through more than
300 surveys in over 90 countries.”
See more at: http://dhsprogram.com/#sthash.F0VyJww6.dpuf
Dissemination mechanisms
- Privacy preservation: how to do it?
- Microdata
- Census data tables
- Reports
- Maps
Small-sample surveys: Data Reliability
- Much lower statistical power
- Issue of convenience sampling (see: Framingham Heart Study)
- Often not as consistently executed as a census.
Modelling approaches
• Population level:
• Event count models
• Age-period-cohort models
• (Time-series models)
• Individual level:
• Time-variant vs. invariant covariates
• Repeatable vs. non-repeatable event
(marriage vs. death)
• Linear model
• Panel dataset: clustered time-series
• Survival analysis
How to deal with time?
Linear models
- Standard social science method.
- In a cross-sectional dataset, who experiences a non-repeatable event?
(e.g. death)
- Can use logistic regression, probit, etc.
- Can also use ordinary least squares regression for Gaussian response
variables, e.g. income.
- Stuff gets complicated fast: WLS, GLS, FGLS, etc.
Event-count models
- What to do about repeatable events?
- Winkelmann and Zimmermann (1994) – for modelling, can use:
- linear model
- Tobit model (DV is positive)
- Ordinary logit / probit
- Event count model, e.g. Poisson Regression, Negative Binomial,
Generalized Event Count, Zero-Inflated Poisson
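A minimal sketch of the simplest event count model in the list above, a Poisson regression with one covariate, fitted by Newton-Raphson in plain Python. The function name and the data are hypothetical; in practice a statistics library would be used, and overdispersed or zero-heavy counts would call for the negative binomial or zero-inflated variants.

```python
import math

def fit_poisson(xs, ys, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(ys, mu))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mu))
        # Fisher information matrix entries.
        h00 = sum(mu)
        h01 = sum(m * x for x, m in zip(xs, mu))
        h11 = sum(m * x * x for x, m in zip(xs, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: beta += information^{-1} * score (2x2 inverse written out).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical counts that double with each unit of x, so b1 should be log(2).
xs = [0, 1, 2, 3, 4]
ys = [1, 2, 4, 8, 16]
b0, b1 = fit_poisson(xs, ys)
print(b0, b1, math.exp(b1))
```

The exponentiated slope is the rate ratio: the multiplicative change in the expected count per unit increase in the covariate.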
Age-Period-Cohort Models
- Setting: (year_of_observation, age, death_rate)
- Effects:
- Age: how much time elapsed from birth until time of observation? (e.g. death rates
much higher above 85 years of age)
- Period: effect specific to the time of observation (e.g. death rates much higher during a
war)
- Cohort: effect specific to the time of birth (e.g. Eastern European 1986 birth cohort)
- So why not just regress: Death_rate ~ Age + Period + Cohort?
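One way to see the problem with that regression: cohort is an exact linear function of the other two terms (cohort = period − age), so different coefficient vectors give identical fitted values. The sketch below uses hypothetical ages, periods, and coefficients purely to demonstrate this.

```python
# Hypothetical (age, period) cells; cohort is fully determined by the other two.
cells = [(age, period) for age in (20, 30, 40) for period in (1990, 2000, 2010)]
rows = [(age, period, period - age) for age, period in cells]

def linear_predictor(coefs, row):
    """Linear part of Death_rate ~ Age + Period + Cohort (no intercept)."""
    a, p, c = coefs
    age, period, cohort = row
    return a * age + p * period + c * cohort

beta = (0.03, 0.01, -0.02)   # arbitrary illustrative coefficients
k = 5.0                      # any shift works
beta_shifted = (beta[0] + k, beta[1] - k, beta[2] + k)

fits = [linear_predictor(beta, r) for r in rows]
fits_shifted = [linear_predictor(beta_shifted, r) for r in rows]
max_gap = max(abs(x - y) for x, y in zip(fits, fits_shifted))
print(max_gap)  # zero up to floating-point rounding
```

Because the data cannot distinguish `beta` from `beta_shifted`, the linear age, period, and cohort effects are not separately identified without an extra constraint.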
Age-Period-Cohort Models
- Yang, Fu and Land (2004): “the identification problem [in APC analysis] still
remains largely unsolved.”
- Two solutions:
- GLM models, w/ either logarithmic or logit link function. Can be reparameterized as
fixed effects models.
- Intrinsic estimator based on SVD
- For microdata, can use mixed effects (Yang and Land, 2006)!
- Hacky solution with lme4: lmer(death_rate ~ age + I(age^2) + (1 | period) + (1 | cohort))
(Yang, Fu, Land, 2004)
Panel Models
- Following the same individual unit over time, repeated measurements
- Time-varying vs. time-invariant covariates
- Different specifications:
- Fixed effects – “within-subjects”: e.g., subtract each unit’s mean, which removes unobserved
time-invariant covariates.
- Random effects – “between-subjects” comparison, valid only if unobserved unit effects are
uncorrelated with the covariates.
- Difference-in-Differences: observe effect of change in covariates on change in DV.
- PVAR: panel vector auto-regression: modelling feedback loops. (Love and Zicchino, 2004)
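A minimal sketch of the fixed-effects (“within”) specification listed above, for a single covariate. The function name and the toy panel are hypothetical; the data are constructed so that each unit has a very different unobserved level but the same slope.

```python
from collections import defaultdict

def within_estimator(panel):
    """Fixed-effects slope for y ~ x with one intercept per unit.

    `panel` is a list of (unit, x, y) rows. Each unit's own means are
    subtracted from x and y, which removes any time-invariant unit effect.
    """
    by_unit = defaultdict(list)
    for unit, x, y in panel:
        by_unit[unit].append((x, y))
    sxy = sxx = 0.0
    for obs in by_unit.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        sxy += sum((x - mx) * (y - my) for x, y in obs)
        sxx += sum((x - mx) ** 2 for x, _ in obs)
    return sxy / sxx

# Hypothetical panel: y = unit_effect + 2 * x, with unit effects 10 and 100.
panel = [("a", 1, 12), ("a", 2, 14), ("a", 3, 16),
         ("b", 4, 108), ("b", 5, 110), ("b", 6, 112)]
slope = within_estimator(panel)
print(slope)
```

A pooled regression on the same rows would mix the within-unit slope with the gap between the two units; demeaning by unit isolates the former.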
Survival Analysis
- Censored data:
- Left truncation: because of when data collection was started
- Right censoring: can’t tell the future!
- Kaplan-Meier estimates
- Cox proportional hazards
- Accelerated Failure Time models
- http://data.princeton.edu/pop509/ParametricSurvival.pdf
Kaplan-Meier Estimates
- Estimate an empirical survival function
- Estimate variance under certain assumptions
^ source: wikipedia
https://rstudio-pubs-static.s3.amazonaws.com/5588_72eb65bfbe0a4cb7b655d2eee0751
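A minimal sketch of the Kaplan-Meier product-limit estimator with right-censored observations, in plain Python. The function name and the follow-up data are hypothetical, and no variance estimate is computed.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct event time.

    durations: follow-up time for each subject
    observed:  True if the event (e.g. death) occurred, False if right-censored
    """
    event_times = sorted({t for t, d in zip(durations, observed) if d})
    survival, curve = 1.0, []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in durations if u >= t)
        events = sum(1 for u, d in zip(durations, observed) if d and u == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up times in years; False marks a right-censored subject.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
print(curve)
```

Censored subjects never count as events, but they stay in the risk set until their censoring time, which is how the estimator uses their partial information.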
Cox Proportional Hazards Model
- Modelling hazard curves, rather than instantaneous probabilities
- Key: proportional hazards assumption – covariates scale the baseline hazard by a constant factor
(a vertical shift on the log-hazard axis).
- Underlying hazard distribution: Weibull, Gompertz, Gamma, log-normal, mixture
- Can even estimate empirical underlying hazard function
- Accommodates time-variant covariates
- R package: survival (coxph)
- https://www.r-bloggers.com/cox-proportional-hazards-model/
- Fox and Weisberg, 2011
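For reference, the proportional hazards assumption can be written out as below. This is the standard textbook form, not taken from the slide; the symbols (baseline hazard h_0, covariate vector x, coefficient vector β) are notation assumed here.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(x^{\top}\beta\right),
\qquad
\frac{h(t \mid x_1)}{h(t \mid x_2)} = \exp\!\left((x_1 - x_2)^{\top}\beta\right)
```

The hazard ratio between two covariate profiles does not depend on t, which is what “proportional” refers to.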
Accelerated Failure Time Model
• T = time to failure
• W = error term, e.g. log-normal, log-logistic, Weibull, exponential, extreme-value, Gaussian
• With a Weibull distribution (exponential as a special case), the AFT model is also a proportional hazards model! (Kalbfleisch and Prentice, 2002)
• Source: http://data.princeton.edu/pop509/ParametricSurvival.pdf
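The model relating T and W is commonly written in the log-linear form below. This is a standard formulation and an assumption about what the slide showed; the covariate vector x, coefficients β, and scale σ are notation introduced here.

```latex
\log T = x^{\top}\beta + \sigma W
\quad\Longleftrightarrow\quad
T = \exp\!\left(x^{\top}\beta\right) T_0, \qquad T_0 = \exp(\sigma W)
```

Covariates therefore stretch or shrink the time scale itself, rather than multiplying the hazard as in the Cox model.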
Geographic Units
• Problem with national boundaries: no consistent meaning
- E.g.: person moving from Luxembourg 30 km into the Netherlands vs
person moving from Xi’an to Beijing.
- Same applies at subnational level:
- E.g., U.S.: states, counties, urban areas, ZIP codes, congressional districts,
school districts, etc.
- Solution: standardization
Geographic Units
- PUMA (US): Public Use Micro-Data Area
- NUTS (Europe)
- No worldwide standard
http://ec.europa.eu/eurostat/web/nuts/overview
http://www.statcan.gc.ca/pub/92-195-x/2011001/other-autre/hierarch/
References
• Andersson, A. M., et al. (2008). Adverse trends in male reproductive health:
We may have reached a crucial “tipping point.” International Journal of
Andrology, 31(2), 74–80.
• Black, D., Gates, G., & Sanders, S. (1999). Demographics of the Gay and
Lesbian Population in the United States: Evidence from Available Systematic
Data Sources.
• Crain, D. A., et al. (2008). Female reproductive disorders: the roles of
endocrine-disrupting compounds and developmental timing. Fertility and
Sterility, 90(4).
• Easterlin, R. A. (1975). An Economic Framework for Fertility Analysis. Studies in
Family Planning, 6(3), 54–63.
• Fenton, K. A., Johnson, A. M., McManus, S., & Erens, B. (2001).
Measuring sexual behaviour: methodological challenges in
survey research. Sexually Transmitted Infections, 77(2), 84–92.
• Goldstein, J. R., & Morning, A. J. (2000). The multiple-race population of the
United States: Issues and estimates. Proceedings of the National Academy
of Sciences, 97(11).
• Lagergren, J., Bergström, R., & Nyrén, O. (2000). No relation between body
mass and gastro-oesophageal reflux symptoms in a Swedish population
based study. Gut, 47(1), 26–29.
• Yang, Y., & Land, K. C. (2006). A Mixed Models Approach to the
Age-Period-Cohort Analysis of Repeated Cross-Section Surveys, with an Application to
Data on Trends in Verbal Test Scores. Sociological Methodology, (December
2006).
• Lesthaeghe, R. J. (2010). The Unfolding Story of the Second Demographic Transition.
Population and Development Review, 36(2), 211–251.
• Marmot, M. (2005). Social determinants of health inequalities. Lancet,
365(9464), 1099–1104.
• Murray, C. J. L., & Lopez, A. D. (1997). Alternative projections of mortality
and disability by cause 1990-2020: Global Burden of Disease Study. Lancet,
349(9064), 1498–1504.
• Soffer, A. (1986). Lebanon: Where Demography Is the Core of Politics and
Life. Middle Eastern Studies, 22(2), 197–205.
• Southworth, J. R. (2005). “Religion” in the 2001 Census for
England and Wales. Population, Space and Place, 11(2), 75–88.
• Vanderbloemen, L., Dorling, D., & Minton, J. (2016). Visualising variation in
mortality rates across the life course and by sex, USA and comparator
states, 1933–2010. Journal of Epidemiology and Community Health, 70(8), 826–831.
• Weon, B. M., & Je, J. H. (2012). Trends in scale and shape of survival
curves. Scientific Reports, 2, 504.
• Winkelmann, R., & Zimmermann, K. F. (1994). Count data models for
demographic data. Math Popul Stud, 4(3), 205.
• Yang, Y., Fu, W., & Land, K. C. (2004). A Methodological Comparison of
Age-Period-Cohort Models: The Intrinsic Estimator and Conventional
Generalized Linear Models. Sociological Methodology, (December 2004).

Digital Demography - WWW'17 Tutorial - Part I
