A brief introduction on how to conduct growth curve statistical analyses using SPSS software, including some sample syntax. Originally presented at IWK Statistics Seminar Series at the IWK Health Center, Halifax, NS, May 1, 2013.
2. When to use a growth curve
Growth curves measure patterns of change over time
Specifically, mean-level changes over time
Patterns can be linear, quadratic, cubic, etc.
        Time 1   Time 2   Time 3
John      10        7        5
Mary       8        5        4
Zoe        7        9        9
Sarah      5        2        1
Bill       2        4        3
MEAN     6.4      5.4      4.4
Mean-Level Change: the row of means (6.4, 5.4, 4.4) declines by 1.0 at each time point, a linear mean-level change
3. Limitations of RM-ANOVA
Requires a balanced design (i.e., no missing data)
Requires equal spacing between time points
Requires independence of observations (not often possible in longitudinal data)
Requires homogeneity of variance
4. Growth Curves overcome these limitations
Accounts for missing data using a full information maximum likelihood (FIML) approach
Does not require equal spacing between time points (can specify unequal time points, e.g., 1, 2, 5, 7, 10)
Does not require independence of observations (can model different types of correlated error structures)
Is robust to violations of the homogeneity of variance assumptions required by RM-ANOVA
5. So… what are growth curves?
Growth curves are a type of mixed (or multilevel) model
Simply put, multilevel models are a way of dealing with clustered data
For example…
7. Growth Curves are Multilevel Models
All multilevel models (MLMs) partition variance into their appropriate levels
E.g., students nested within schools
Multilevel models also use maximum likelihood estimation, which handles missing data better and is more flexible when dealing with real data
Growth curves are a specific type of MLM where:
The lowest level of observation is repeated measures
The predictor variable is TIME
8. Application to a clinical context
The RCT is a common design
Growth curves can be used instead of ANOVA
The time*interv interaction is most important
(Leiter et al., 2012)
9. How do you do this in SPSS?
First, you need to convert your data from “WIDE” format to “LONG” format
Wide Format
10. Long Format
(Use the syntax provided in the handout to get this):
Long Format
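The restructuring itself is a single VARSTOCASES command (the full version is in the appendix); a minimal sketch, assuming the three repeated measurements are stored as ASItotal.0, ASItotal.1, and ASItotal.2:
*Stack the three ASItotal columns into one long-format variable indexed by time.
VARSTOCASES
/MAKE ASItotal FROM ASItotal.0 ASItotal.1 ASItotal.2
/INDEX=time(3)
/KEEP=id interv
/NULL=KEEP.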
11. Coding the Time Variable is Important
The choices you make for your time variable will influence your analyses!
If relationships are linear, the codes need to be equidistant: 1, 2, 3 OR -1, 0, 1, etc.
If you are expecting a quadratic relationship, you also need to calculate time-squared: 1, 4, 9 OR 1, 0, 1
Unequal time points (e.g., 1 month, 3 months, 12 months) can be coded to match: 1, 3, 12
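For the unequal-spacing case, one option is to recode the evenly numbered index into the actual assessment occasions; a sketch, assuming assessments at 1, 3, and 12 months (the variable name timemonths is made up for illustration):
*Recode the 1/2/3 index into the real months of assessment.
RECODE time (1=1) (2=3) (3=12) INTO timemonths.
EXECUTE.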
12. Decision 1: ML vs REML
Maximum Likelihood Estimation (ML)
vs
Restricted Maximum Likelihood Estimation (REML)
REML is generally preferred because it provides less biased estimates of the variance components
ML would be preferred if you need to compare nested models that differ in their fixed effects, as REML is not adequate for this
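In the MIXED syntax shown later (slide 19), this choice is just the /METHOD subcommand; a sketch of refitting the same model with ML when you need to compare nested models:
*Same growth curve as the REML version, refit with ML for nested-model comparisons.
MIXED ASItotal WITH time interv
/METHOD = ML
/FIXED = time interv time*interv | SSTYPE(3)
/RANDOM = INTERCEPT time interv | SUBJECT(id) COVTYPE(UN)
/PRINT = SOLUTION TESTCOV HISTORY.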
13. Decision 2: Fixed vs Random
Random vs. Fixed Slopes & Intercepts
Random (varying): Allow them to vary across people
Fixed (constant): Force them to be equal across people
Random vs. Fixed has no single, agreed-upon definition (Gelman, 2005); I’m presenting a practical conceptualization
Fixed (constant) intercepts and slopes are more parsimonious and less computationally intensive, but may not fit the data as well. Select the most parsimonious model that fits the data best.
14. Random (varying) Intercepts
Random (varying) Slopes
http://www.spss.ch/upload/1126184451_Linear%20Mixed%20Effects%20Modeling%20in%20SPSS.pdf
15. Random (varying) Intercepts
Fixed (constant) Slopes
http://www.spss.ch/upload/1126184451_Linear%20Mixed%20Effects%20Modeling%20in%20SPSS.pdf
16. Fixed (constant) Intercepts
Random (varying) Slopes
http://www.spss.ch/upload/1126184451_Linear%20Mixed%20Effects%20Modeling%20in%20SPSS.pdf
17. Decision 3: Linear, Quadratic, or Cubic?
If slopes are allowed to be random (varying), then you need at least:
3 time points for linear
4 time points for quadratic (add time*time as a predictor)
5 time points for cubic (add time*time and time*time*time as predictors)
One less time point is needed if using fixed slopes
Today, I’m focusing on LINEAR relationships
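Creating those polynomial terms takes one COMPUTE statement each; quadtime is the name used in the appendix, and cubtime is an illustrative name for the cubic term:
*Polynomial codes of the time variable for quadratic and cubic curves.
COMPUTE quadtime = time*time.
COMPUTE cubtime = time*time*time.
EXECUTE.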
18. Decision 4: Covariance Structure
Is there a predictable pattern to the errors?
If you are unsure, specify an “unstructured” matrix
Less parsimony because it lets things freely vary
AR(1) correlated error structure is also fairly common
Autoregressive correlated errors, getting smaller as time points get more distant
You can test multiple models with different plausible structures, and choose the one that fits the data best
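As a sketch of the AR(1) option: in SPSS MIXED, autocorrelated residuals are requested with the REPEATED subcommand. This variant assumes the random-intercept model from the appendix (slide 28); it presumes your time variable indexes the measurement occasions within each id:
*Random intercept with AR(1) autocorrelated residuals across time points.
MIXED ASItotal WITH time interv
/METHOD = REML
/FIXED = time interv time*interv | SSTYPE(3)
/RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
/REPEATED = time | SUBJECT(id) COVTYPE(AR1)
/PRINT = SOLUTION TESTCOV HISTORY.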
19. Annotated Syntax
MIXED ASItotal WITH time interv
/METHOD = REML
/FIXED = time interv time*interv | SSTYPE(3)
/RANDOM = INTERCEPT time interv | SUBJECT(id) COVTYPE(UN)
/PRINT = SOLUTION TESTCOV HISTORY.
*Mixed model: the dependent variable is predicted by time and intervention.
*Restricted Maximum Likelihood Estimation (usually better than ML).
*Put all predictors after FIXED. Indicate interactions by Var1*Var2.
*The intercept and the slopes for time and interv are random. The slope for the interaction is fixed because I omitted it from this part.
*"UN" specifies an unstructured covariance matrix (other types are possible, but require thought).
20. Annotated Output: Model Comparison
Use the BIC values to compare nested models (e.g., random slopes vs fixed slopes)
Lower BIC values indicate better fit; a difference of more than about 4 (∆BIC > 4) suggests a meaningful improvement
21. Annotated Output: Covariance Parameters
UN(1,1) = Variance of the Intercept. Significant, so random intercepts are important to include.
UN(2,2) = Variance of the slope for time. Non-significant, which suggests that a more parsimonious model with fixed slopes for time would fit the data better.
22. Annotated Output
Interpret like ANOVA; parameters adjusted for clustering
Time -> Main effect for time (linear, in this case)
Interv -> Main effect for intervention
Time * interv -> 2-way Interaction
Graphing the interaction is usually important for understanding it
Dummy coding (0, 1) the intervention variable helps a LOT
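A sketch of that dummy coding, assuming interv is currently coded 1 = control and 2 = intervention (your original coding may differ, so adjust the RECODE accordingly):
*Create a 0/1 dummy: 0 = control, 1 = intervention (assumed original coding 1/2).
RECODE interv (1=0) (2=1) INTO interv01.
EXECUTE.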
23. Graphing the interaction
Can graph the interaction using tools meant for moderation in linear regression with this kind of model
The parameters in the output are interpreted the same way; they’re just adjusted so that you’re accounting for the clustering due to repeated measurement and missing data
http://www.jeremydawson.co.uk/slopes.htm
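Equivalently, you can plot model-implied means yourself from the Estimates of Fixed Effects table; a sketch with placeholder coefficients (the numbers below are made up and should be replaced with the estimates from your own output):
*Predicted values from the fixed effects: b0 + b1*time + b2*interv + b3*time*interv.
COMPUTE pred = 6.4 + (-1.0)*time + (0.5)*interv + (-0.8)*time*interv.
EXECUTE.
*Plot the predicted trajectories for each group.
GRAPH
/LINE(MULTIPLE)=MEAN(pred) BY time BY interv.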
24. A few closing points
Other software can implement this (e.g., SAS, Mplus, HLM)
Non-normal data may be better modeled with different distributional assumptions (e.g., Poisson)
Modeling of covariance structures may be important, but can be challenging to figure out
Some programs (e.g., Mplus) may use a latent variable approach
25. Questions? Comments?
Thank you!
P.S. The handout I provided contains some syntax and instructions that may be helpful!
Email me if you want an electronic copy of the presentation: mackinnon.sean@dal.ca
26. Appendix: Syntax
*Convert data from LONG to WIDE format.
SORT CASES BY id time.
CASESTOVARS
/ID=id
/INDEX=time
/GROUPBY=VARIABLE.
*Convert data from WIDE to LONG format.
VARSTOCASES
/MAKE ASItotal FROM ASItotal.0 ASItotal.1 ASItotal.2
/INDEX=time(3)
/KEEP=id interv
/NULL=KEEP.
27. Appendix: Syntax
*Linear Growth Curve with Intervention Group as Moderator (Random Intercept, Random Slopes).
MIXED ASItotal WITH time interv
/METHOD = REML
/FIXED = time interv time*interv | SSTYPE(3)
/RANDOM = INTERCEPT time interv time*interv | SUBJECT(id) COVTYPE(UN)
/PRINT = SOLUTION TESTCOV HISTORY.
28. Appendix: Syntax
*Linear Growth Curve with Intervention Group as Moderator (Random Intercept, Fixed Slopes).
MIXED ASItotal WITH time interv
/METHOD = REML
/FIXED = time interv time*interv | SSTYPE(3)
/RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
/PRINT = SOLUTION TESTCOV HISTORY.
29. Appendix: Syntax
*Linear Growth Curve with Intervention Group as Moderator (Fixed Intercept, Random Slopes).
MIXED ASItotal WITH time interv
/METHOD = REML
/FIXED = time interv time*interv | SSTYPE(3)
/RANDOM = time interv time*interv | SUBJECT(id) COVTYPE(UN)
/PRINT = SOLUTION TESTCOV HISTORY.
30. Appendix: Syntax
*Quadratic Growth Curve with Intervention Group as Moderator (Random Intercept, Fixed Slopes).
COMPUTE quadtime = time*time.
EXECUTE.
MIXED ASItotal WITH time quadtime interv
/METHOD = REML
/FIXED = time quadtime interv time*interv quadtime*interv | SSTYPE(3)
/RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
/PRINT = SOLUTION TESTCOV HISTORY.