Nonlinear Discrete-time Hazard Models for Entry into Marriage
Presentation Transcript

  • Nonlinear Discrete-time Hazard Models for Entry into Marriage. Heather Turner, Andy Batchelor, David Firth. Department of Statistics, University of Warwick, UK. 8th March 2010.
  • Motivating Application: The LII Survey. The Living in Ireland (LII) Surveys were conducted 1994-2001. For five 5-year cohorts of women born between 1950 and 1975 we have the following data: year of (first) marriage; year and month of birth; social class; highest level of education attained; and the year in which that level was attained.
  • When do women get married? We can use methods from survival analysis to model the timing of marriage. Taking time to start from the legal age of marriage, the survival time $T$ is the time until a person marries. The time of marriage is recorded to the nearest year, so we use a discrete-time analysis.
  • Discrete-time Hazard Models. In discrete time, the hazard of marriage occurring at time $t$ is defined as $h(t) = P(T = t \mid T \ge t)$. We are interested in the shape of the hazard over the life course and in how the hazard is affected by covariates.
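As a minimal sketch of the definition above, the discrete-time hazard can be computed from risk-set counts. The counts below are hypothetical (they are not the LII data) and assume no censoring, so each risk set is the previous one minus the previous marriages.

```r
# Hypothetical counts, not the LII data: women still unmarried at the start
# of each year of age (at_risk) and marriages during that year (events)
age     <- 16:20
at_risk <- c(100, 98, 93, 85, 74)
events  <- c(2, 5, 8, 11, 12)

# Discrete-time hazard: h(t) = P(T = t | T >= t)
hazard <- events / at_risk

# Survival function: S(t) = P(T > t) = product over s <= t of (1 - h(s))
survival <- cumprod(1 - hazard)

print(data.frame(age, hazard = round(hazard, 3), survival = round(survival, 3)))
```

With no censoring, the final survival value equals the proportion still unmarried after the last year, which is a quick consistency check on the calculation.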
  • Cox Proportional Odds Model. A popular choice is the proportional odds model proposed by Cox (JRSSB, 1972):
    $$\frac{h(t \mid x_{it})}{1 - h(t \mid x_{it})} = \frac{h_0(t)}{1 - h_0(t)} \exp(x_{it}\beta)$$
    where $h_0(t)$ is the baseline hazard. Taking logs we obtain
    $$\operatorname{logit}(h(t \mid x_{it})) = \operatorname{logit}(h_0(t)) + x_{it}\beta = l_t + x_{it}\beta$$
    The model is semi-parametric: it makes no assumption about the shape of the hazard function.
  • Episode-splitting. A simple way to estimate the proportional odds model is to generate an event history for each observation. Pseudo-observations are created at each time point from time 0 up to marriage or censoring; this is known as episode-splitting. The parameters in the proportional odds model can then be estimated by fitting a logistic regression model to a binary indicator of marriage at each time point (married = 1, unmarried = 0).
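The episode-splitting step can be sketched in base R as follows. The data are simulated purely for illustration; the variable names (`persons`, `episodes`, `x`), the baseline hazards and the covariate effect are all assumptions, not taken from the LII analysis.

```r
set.seed(1)

# Simulate hypothetical person-level spells (not the LII data)
n    <- 500
x    <- rbinom(n, 1, 0.5)                    # assumed binary covariate
base <- c(0.05, 0.10, 0.15, 0.10, 0.05)      # assumed baseline hazards, t = 1..5

sim_one <- function(xi) {
  for (k in seq_along(base)) {
    p <- plogis(qlogis(base[k]) - 0.5 * xi)  # proportional odds in x
    if (runif(1) < p) return(c(time = k, married = 1))
  }
  c(time = length(base), married = 0)        # censored at the last time point
}
persons <- data.frame(id = seq_len(n), x = x, t(sapply(x, sim_one)))

# Episode-splitting: one pseudo-observation per person per time point at risk
episodes       <- persons[rep(seq_len(n), persons$time), ]
episodes$t     <- sequence(persons$time)
episodes$event <- as.integer(episodes$married == 1 & episodes$t == episodes$time)

# Logistic regression on the binary indicator; factor(t) gives a separate
# baseline parameter l_t for each time point
fit <- glm(event ~ 0 + factor(t) + x, family = binomial, data = episodes)
print(coef(fit))
```

Using `factor(t)` without an intercept keeps the baseline unrestricted, which corresponds to the semi-parametric form of the model; a parametric baseline would replace `factor(t)` with a function of time.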
  • Cox Proportional Odds Model. [Figure: probability of marriage against age (years), ages 15 to 43.]
  • Sidenote: interval-censored data. A similar model can be obtained by assuming that the data are interval-censored observations from a continuous-time proportional hazards model. The coefficients $\beta$ in the model
    $$\operatorname{cloglog}(h(t \mid x_{it})) = l_t + x_{it}\beta$$
    are then the coefficients of the proportional hazards model. This relationship breaks down, however, if $l_t$ is replaced by a parametric function.
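The link between the two models can be made explicit with a short derivation. This is the standard grouped-data argument, sketched under the assumption that covariates are constant within each interval $(t-1, t]$; the symbols $\lambda_0$ and $S$ (continuous-time baseline hazard and survivor function) are introduced here for illustration.

```latex
\begin{align*}
\lambda(u \mid x_{it}) &= \lambda_0(u)\exp(x_{it}\beta)
  && \text{(continuous-time proportional hazards)} \\
h(t \mid x_{it}) &= 1 - \frac{S(t \mid x_{it})}{S(t-1 \mid x_{it})}
  = 1 - \exp\!\left\{-\exp(x_{it}\beta)\int_{t-1}^{t}\lambda_0(u)\,du\right\} \\
\log\{-\log(1 - h(t \mid x_{it}))\} &= l_t + x_{it}\beta,
  \qquad l_t = \log\int_{t-1}^{t}\lambda_0(u)\,du
\end{align*}
```

The same $\beta$ therefore appears in both models as long as $l_t$ is left free for each interval.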
  • Blossfeld and Huinink Model. Blossfeld and Huinink (Am. J. Sociol., 1991) propose the following parametric baseline:
    $$\operatorname{logit}(h_0(t \mid \mathrm{age}_{it})) = l(\mathrm{age}_{it}) = c + \beta_l \log(\mathrm{age}_{it} - 15) + \beta_r \log(45 - \mathrm{age}_{it})$$
    This baseline describes the nature of the time dependence and fixes the support of the hazard to be 15 to 45 years.
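A small sketch of this baseline in R may help to show its shape. The parameter values are hypothetical, chosen only so that the curve is unimodal; `bh_logit` is an illustrative helper, not a function from any package.

```r
# Blossfeld-Huinink baseline on the logit scale, support fixed at (15, 45)
bh_logit <- function(age, c0, beta_l, beta_r) {
  c0 + beta_l * log(age - 15) + beta_r * log(45 - age)
}

# Hypothetical parameter values, for illustration only
c0 <- -20; beta_l <- 2; beta_r <- 4

age    <- 16:44
hazard <- plogis(bh_logit(age, c0, beta_l, beta_r))

# For positive beta_l and beta_r, setting the derivative
# beta_l / (age - 15) - beta_r / (45 - age) to zero gives the peak location
peak <- (45 * beta_l + 15 * beta_r) / (beta_l + beta_r)
print(peak)
print(age[which.max(hazard)])
```

The peak location depends on the fixed endpoints 15 and 45 as well as on the two slopes, which is one reason the choice of endpoints affects the fitted curve.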
  • BH Model. [Figure: probability of marriage against age (years), ages 10 to 50, with plotted points.]
  • Effect of Endpoints. [Figure: probability of marriage against age (years) for hazard support 15−45 years and 12−75 years.]
  • Nonlinear Discrete-time Hazard Model. An obvious extension of the BH model is to treat the endpoints as parameters:
    $$l(\mathrm{age}_{it}) = c + \beta_l \log(\mathrm{age}_{it} - \alpha_l) + \beta_r \log(\alpha_r - \mathrm{age}_{it})$$
    The model is nonlinear, so available software needs to be extended. There is near-aliasing between parameters, so the model needs to be reparameterised.
  • Developing the Nonlinear Model. First, analyse using the BH model as a reference. Then analyse using the extended model and illustrate near-aliasing. Finally, analyse using a re-parameterised nonlinear discrete model: compare to the BH model, then refine the model for the LII data.
  • BH Models. The BH models can be fitted using the glm function in R. Following the model building strategy of Blossfeld & Huinink (1991), we select a cohort factor and a time-varying indicator of educational status (in/out). For the 1970-1974 cohort the conditional odds of marriage are 24% of those for the 1950-1954 cohort. For women in education the conditional odds of marriage are 11% of those for women not in education.
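To make the odds-ratio interpretation concrete, the following sketch converts the two quoted odds ratios to logit-scale coefficients and applies one of them to a baseline hazard. The baseline hazard of 0.10 is an assumed value for illustration, not an estimate from the talk.

```r
# Odds ratios quoted in the text
or_cohort <- 0.24   # 1970-1974 cohort relative to the 1950-1954 cohort
or_educ   <- 0.11   # in education relative to not in education

# Corresponding coefficients on the logit scale
print(log(or_cohort))
print(log(or_educ))

# Effect on a hypothetical baseline hazard of 0.10 (assumed value)
h0     <- 0.10
odds   <- h0 / (1 - h0)
h_educ <- or_educ * odds / (1 + or_educ * odds)
print(h_educ)
```

Because the model is multiplicative on the odds scale, the change in the hazard itself depends on the baseline value, although for small hazards the two scales are close.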
  • Selected BH Model. [Figure: probability of marriage against age (years), ages 15 to 45, one curve per birth cohort: (1949,1954], (1954,1959], (1959,1964], (1964,1969], (1969,1974].] Deviance = 12073; residual d.f. = 31001.
  • Nonlinear Discrete-time Hazard Models. The nonlinear discrete-time hazard model is an example of a generalised nonlinear model, which can be fitted using the gnm package in R (Turner and Firth, R News, 2007). Parameters are estimated by a modified IWLS algorithm. Certain nonlinear terms are inbuilt, e.g. Mult, Exp. Our terms cannot be expressed in terms of these functions, so we need to write a custom "nonlin" function.
  • Custom "nonlin" Function.
    LogExcess <- function(age, side = "left"){
        call <- sys.call()
        # Bound just outside the observed ages; exp(alpha) > 0 is added below,
        # so the endpoint always lies beyond the data
        constraint <- ifelse(side == "left", min(age) - 1e-5, max(age) + 1e-5)
        list(predictors = list(beta = ~1, alpha = ~1),
             variables = list(substitute(age)),
             # Builds beta * log(age - constraint + exp(alpha)) for side = "left"
             # and beta * log(-age + constraint + exp(alpha)) for side = "right"
             term = function(predLabels, varLabels) {
                 paste(predLabels[1], " * log(",
                       " -"[side == "right"], varLabels[1], " + ",
                       " -"[side == "left"], constraint,
                       " + exp(", predLabels[2], "))")
             },
             call = as.expression(call))
    }
    class(LogExcess) <- "nonlin"
  • Summary of Baseline Model.
    Call:
    gnm(formula = marriages/lives ~ LogExcess(age, side = "left") +
        LogExcess(age, side = "right"), family = binomial, data = fulldata,
        weights = lives, start = c(-20, 3, 0, 3, 0))

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -0.8098  -0.4441  -0.3224  -0.1528   4.0483

    Coefficients:
                                          Estimate  Std. Error  z value  Pr(>|z|)
    (Intercept)                          -118.5395    201.6387   -0.588   0.55661
    LogExcess(age, side = "left")beta       3.6928      1.1913    3.100   0.00194
    LogExcess(age, side = "left")alpha     -0.1432      0.8935   -0.160   0.87267
    LogExcess(age, side = "right")beta     24.8623     38.5743    0.645   0.51923
    LogExcess(age, side = "right")alpha     4.0247      1.7376    2.316   0.02054

    Std. Error is NA where coefficient has been constrained or is unidentified

    Residual deviance: 12553 on 31004 degrees of freedom
    AIC: 12748

    Number of iterations: 76
  • Parameter Correlations.
              c         βl        αl        βr        αr
    c     1.00000
    βl   -0.92563   1.00000
    αl   -0.80861   0.95844   1.00000
    βr   -0.99999   0.92688   0.80989   1.00000
    αr   -0.99833   0.90319   0.77910   0.99808   1.00000
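One way to see where such strong correlations come from is a linearised sketch with the endpoints held fixed. The values `alpha_l = 14` and `alpha_r = 100` are assumptions chosen to be roughly in line with the fitted baseline, and the age grid is hypothetical; this is an illustration of the mechanism, not a reproduction of the table above.

```r
age     <- 15:44
alpha_l <- 14     # assumed left endpoint, held fixed
alpha_r <- 100    # assumed right endpoint, held fixed

# Design matrix of the model when the endpoints are treated as known
X <- cbind(c      = 1,
           beta_l = log(age - alpha_l),
           beta_r = log(alpha_r - age))

# Correlations implied by the unscaled covariance matrix (X'X)^{-1}
corr <- cov2cor(solve(crossprod(X)))
print(round(corr, 3))
```

When the right endpoint lies far beyond the observed ages, `log(alpha_r - age)` is almost constant over the data, so its coefficient is hard to separate from the intercept. For a fitted model object, `cov2cor(vcov(fit))` gives the corresponding correlation matrix.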
  • Example 'Recoil' Plot. [Figure, shown in three successive builds: probability of marriage against age, ages 10 to 50.]
  • Is Near-aliasing a Problem? The extended model can still be used as a baseline hazard:
    $$\operatorname{logit}(h(t \mid x_{it})) = l(\mathrm{age}_{it}) + x_{it}\beta$$
    Near-aliasing will make models harder to fit, particularly with several covariates. Not all parameters are interpretable.
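  • Re-parameterizing the Nonlinear Model. The nonlinear hazard model can be re-parameterized as follows:
    $$l(\mathrm{age}_{it}) = \gamma - \delta(\nu - \alpha_l)\log\frac{\nu - \alpha_l}{\mathrm{age}_{it} - \alpha_l} - \delta(\alpha_r - \nu)\log\frac{\alpha_r - \nu}{\alpha_r - \mathrm{age}_{it}}$$
    This corresponds to $\beta_l = \delta(\nu - \alpha_l)$ and $\beta_r = \delta(\alpha_r - \nu)$ in the original parameterisation, so that $l$ attains its maximum value $\gamma$ at $\mathrm{age}_{it} = \nu$.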
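The re-parameterisation can be checked numerically. The sketch below writes the re-parameterised baseline with both logarithmic terms subtracted, which is the form that places the maximum at ν with value γ and matches the original parameterisation with positive slopes. All parameter values are hypothetical, and `l_orig` and `l_repar` are illustrative helper names.

```r
# Original parameterisation
l_orig <- function(age, c0, beta_l, beta_r, alpha_l, alpha_r) {
  c0 + beta_l * log(age - alpha_l) + beta_r * log(alpha_r - age)
}

# Re-parameterised form: gamma = logit of the peak hazard,
# nu = peak location, delta = fall-off
l_repar <- function(age, gamma, delta, nu, alpha_l, alpha_r) {
  gamma -
    delta * (nu - alpha_l) * log((nu - alpha_l) / (age - alpha_l)) -
    delta * (alpha_r - nu) * log((alpha_r - nu) / (alpha_r - age))
}

# Hypothetical parameter values, for illustration only
gamma <- -2; delta <- 0.3; nu <- 25; alpha_l <- 14; alpha_r <- 60

# Implied parameters of the original form
beta_l <- delta * (nu - alpha_l)
beta_r <- delta * (alpha_r - nu)
c0     <- gamma - beta_l * log(nu - alpha_l) - beta_r * log(alpha_r - nu)

age <- 15:50
new <- l_repar(age, gamma, delta, nu, alpha_l, alpha_r)
old <- l_orig(age, c0, beta_l, beta_r, alpha_l, alpha_r)

print(all.equal(new, old))    # the two forms agree
print(age[which.max(new)])    # peak location equals nu
print(plogis(max(new)))       # peak hazard equals expit(gamma)
```

The two forms describe the same family of curves; only the coordinates change, which is why the fit is unchanged while the parameter correlations differ.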
  • Interpretation of Parameters. The parameters of the new parameterisation have a more useful interpretation than before. [Figure: hazard curve with expit(γ) marked on the probability of marriage axis and αL, ν and αR marked on the age (years) axis.]
  • New Parameter Correlations.
              γ         ν         δ         αl        αr
    γ     1.00000
    ν     0.12956   1.00000
    δ     0.21943  -0.69849   1.00000
    αl    0.27236  -0.42848   0.91425   1.00000
    αr    0.03231  -0.75428   0.93696   0.77910   1.00000
    Table: Correlations between the estimated parameters of the reparameterized baseline model.
  • Recoil Plots for Reparameterised Model. [Figure: five panels of probability of marriage against age, ages 10 to 50, each showing the original model, the perturbed model and the re-fitted model, and each labelled with a change in one parameter: peak height (γ), −2.09 → −1.95; peak location (ν), 25.39 → 28; fall off (δ), 0.34 → 0.15; left endpoint (αL), 14.17 → 15.04; right endpoint (αR), 100.66 → 47.68.]
  • Analysis with the Reparameterised Model. We can now repeat the previous analysis using the nonlinear baseline hazard instead of the BH hazard function. The model selection is qualitatively unchanged. The residual deviance is reduced by about 20 at the expense of 2 d.f. There is a lot of uncertainty about the right end-point: in the final model it is estimated as 400 years with a large standard error.
  • Infinite Right End-point. It seems more appropriate to define the baseline hazard in which the right end-point tends to infinity:
    $$l(\mathrm{age}_{it}) = \gamma - \delta(\nu - \alpha_l)\log\frac{\nu - \alpha_l}{\mathrm{age}_{it} - \alpha_l} - \delta(\mathrm{age}_{it} - \nu)$$
    Re-fitting the final model with this baseline increases the deviance by a negligible amount.
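The linear term can be obtained as a limit of the right-hand logarithmic term of the finite-endpoint baseline. The following is a sketch of that step, writing the right-hand term with the sign that gives a maximum at $\nu$ and using $\log(1 - u) \approx -u$ for small $u$.

```latex
\begin{align*}
-\delta(\alpha_r - \nu)\log\frac{\alpha_r - \nu}{\alpha_r - \mathrm{age}_{it}}
  &= \delta(\alpha_r - \nu)\log\left(1 - \frac{\mathrm{age}_{it} - \nu}{\alpha_r - \nu}\right) \\
  &\longrightarrow -\delta(\mathrm{age}_{it} - \nu)
  \qquad \text{as } \alpha_r \to \infty
\end{align*}
```

At $\mathrm{age}_{it} = \nu$ both remaining terms vanish and their derivatives cancel, so $\gamma$ and $\nu$ keep their interpretation as the logit of the peak hazard and the peak location.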
  • Comparing Models. [Figure: two panels of probability of marriage against age (years), ages 15 to 45, one curve per birth cohort: (1949,1954], (1954,1959], (1959,1964], (1964,1969], (1969,1974]. Left panel: Deviance = 12073, residual d.f. = 31001. Right panel: Deviance = 12051, residual d.f. = 31000.]
  • Refining the Model. The model building strategy so far has been similar to that of Blossfeld and Huinink (1991), for comparison. Careful consideration of the fit of the model suggests that improvements can be made.
  • Final Model with New Baseline. [Figure: probability of marriage against age (years), ages 15 to 45, one curve per birth cohort: (1949,1954], (1954,1959], (1959,1964], (1964,1969], (1969,1974].] Deviance = 12051; residual d.f. = 31000.
  • Cohort Effect. We can investigate the cohort effect further by replacing the cohort factor by a year-of-birth factor and plotting the resultant effects. [Figure: year-of-birth effect against year of birth.]
  • Year-of-birth Effect. The plot suggests a more appropriate model:
    $$\theta \exp(\lambda(\mathrm{yrb}_i - 1950))$$
    Replacing the year-of-birth factor with this nonlinear term reduces the deviance by 19 whilst gaining 2 d.f.
  • Checking the Fit. The new year-of-birth term takes account of the effect of this factor on the magnitude of the hazard. To check for other effects on the hazard, we can group the data by year of age and cohort, then plot the corresponding observed and fitted proportions.
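The grouping step could be sketched as follows in base R. The data and the model are simulated stand-ins (a plain logistic regression with hypothetical variables `age`, `cohort` and `event`), not the LII data or the nonlinear model of the talk.

```r
set.seed(2)

# Hypothetical episode-level data, for illustration only
n <- 2000
d <- data.frame(age    = sample(16:25, n, replace = TRUE),
                cohort = factor(sample(c("A", "B"), n, replace = TRUE)))
d$event <- rbinom(n, 1, plogis(-3 + 0.1 * (d$age - 16) - 0.5 * (d$cohort == "B")))

# Stand-in model; any fitted model with per-row fitted probabilities would do
fit      <- glm(event ~ age + cohort, family = binomial, data = d)
d$fitted <- fitted(fit)

# Observed and fitted proportions, grouped by cohort (rows) and age (columns)
obs_grp <- tapply(d$event,  list(d$cohort, d$age), mean)
fit_grp <- tapply(d$fitted, list(d$cohort, d$age), mean)

# One panel per cohort: observed points with the fitted curve overlaid
ages <- as.numeric(colnames(obs_grp))
plot(ages, obs_grp["A", ], xlab = "Age (years)", ylab = "Proportion married")
lines(ages, fit_grp["A", ])
```

Systematic gaps between the points and the curve within a group indicate an effect the model has not captured, for example a shift in peak location rather than in magnitude.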
  • Fit over Cohorts. [Figure: five panels of proportion married against age (years), ages 15 to 45, one per cohort, with group sizes in brackets: (1949, 1954] (5211); (1955, 1959) (6283); (1959, 1964] (6560); (1965, 1969] (6289); (1969, 1974] (6666).]
  • Fit over Education Levels. [Figure: five panels of proportion married against age (years), ages 15 to 45, one per education level, with group sizes in brackets: No attainment/primary (2366); Lower secondary (7900); Upper secondary (11507); College (4829); University (4407). Each panel shows the observed proportions, Model 13 (common peak) and Model 14 (separate peaks).]
  • Linear Dependence of Peak Location. Quantifying the education level by a dynamic measure of years in education, ed, we incorporate a linear dependence of peak location on ed:
    $$l(x_{it}) = \gamma - \delta(\nu_0 + \nu_1 \mathrm{ed}_i - \alpha_l)\log\frac{\nu_0 + \nu_1 \mathrm{ed}_i - \alpha_l}{\mathrm{age}_{it} - \alpha_l} - \delta\{\mathrm{age}_{it} - \nu_0 - \nu_1 \mathrm{ed}_i\}$$
    This results in a non-proportional hazards model.
  • Years Post-Education. Checking the fit against years post-education. [Figure: proportion married against years post-education, −10 to 30, annotated with: a lower rate of increase in the first 3 years post-education; a sharp change at 7 years post-education; outlying points.]
  • Early Career Effect. The lower rate of increase during the first 3 years post-education may be explained by an early career effect. This can be incorporated in the model by including an appropriate indicator variable, which significantly reduces the deviance. The deviance does not significantly increase when the left endpoint is constrained to 15 years.
  • Effect of Education. Peak location varies from 20.78 years (primary education) to 26.89 years (university graduates). [Figure: probability of marriage against age (years), ages 10 to 50, one curve per education level: Primary, Lower sec., Upper sec., PLC, IT, University.]
  • Effect of Year-of-birth. Peak hazard varies from 0.17 (b. 1950) through 0.15 (b. 1960) to 0.07 (b. 1970). [Figure: probability of marriage against age (years), ages 10 to 50, one curve per year of birth: 1950, 1960, 1970.]
  • Summary. Estimating the support of the hazard function improves fit. Near-aliasing can occur in nonlinear models, but can be overcome by re-parameterisation. Our proposed model has more interpretable parameters, particularly the location and magnitude of the maximum hazard, so we can investigate the effect of covariates on these features. The parametric form does impose some restrictions on the shape of the hazard curve.
  • References. A comprehensive manual is distributed with the gnm package at http://www.cran.r-project.org/package=gnm
    A working paper on the marriage application is available at www.warwick.ac.uk/go/crism/research/2007
  • Acknowledgements. The data are from The Economic and Social Research Institute Living in Ireland Survey Microdata File (©Economic and Social Research Institute). We gratefully acknowledge Carmel Hannan for introducing us to this application and providing background on the data.