Mixed Models Day 3 - 2023

Mixed Models Day 3:

Longitudinal Data II
Modelling Time as Categorical
Rebecca Stellato
Longitudinal data (modelling time as categorical)
Overview

• Reisby example, revisited


• Longitudinal data: modelling time as categorical
o the variance-covariance matrix of repeated measures
o correlation structures & covariance pattern models
o covariance patterns of linear mixed models
• Checking assumptions of mixed models
• Three-level models

2
Reisby example, revisited
Hamilton Depression Rating Score, 66 patients, 6 time points

3
Reisby example, revisited
Hamilton Depression Rating Score, 66 patients, 6 time points
> describe(reisby.wide[,3:8])
n mean sd
hdrs.0 61 23.44 4.53
hdrs.1 63 21.84 4.70
hdrs.2 65 18.31 5.49
hdrs.3 65 16.42 6.42
hdrs.4 63 13.62 6.97
hdrs.5 58 11.95 7.22

hdrs.0 hdrs.1 hdrs.2 hdrs.3 hdrs.4 hdrs.5


hdrs.0 1.000 0.493 0.410 0.333 0.227 0.184
hdrs.1 0.493 1.000 0.494 0.412 0.308 0.218
hdrs.2 0.410 0.494 1.000 0.738 0.669 0.461
hdrs.3 0.333 0.412 0.738 1.000 0.817 0.568
hdrs.4 0.227 0.308 0.669 0.817 1.000 0.654
hdrs.5 0.184 0.218 0.461 0.568 0.654 1.000

4
Observed var-cov matrix Reisby dataset
(Combining SD’s and correlations from previous slide)

> round(var(reisby.wide[,3:8], use="pairwise.complete.obs"), digits = 2)

hdrs.0 hdrs.1 hdrs.2 hdrs.3 hdrs.4 hdrs.5


hdrs.0 20.55 10.11 10.14 10.09 7.19 6.28
hdrs.1 10.11 22.07 12.28 12.55 10.26 7.72
hdrs.2 10.14 12.28 30.09 25.13 24.63 18.38
hdrs.3 10.09 12.55 25.13 41.15 37.34 23.99
hdrs.4 7.19 10.26 24.63 37.34 48.59 30.51
hdrs.5 6.28 7.72 18.38 23.99 30.51 52.12

5
Random intercept + random linear time effect
• Yesterday we modelled time as continuous, and (mostly) linear
• What if we think it is unreasonable to use time as linear?
o time continuous: polynomials (yesterday)
o time continuous: splines (beyond the scope of the course)
o time continuous: non-linear models (tomorrow, briefly)

o time “categorical”: when we have a fixed number of time points at
  which everyone (theoretically) is measured, we can model time as a
  factor variable
6
LMM matrix formulation & var-covar matrix
• Recall the linear mixed model:
o 𝑦𝑖𝑗 = 𝛽0 + 𝛽1𝑋1𝑖𝑗 + ⋯ + 𝜐0𝑖 + 𝜐1𝑖𝑋1𝑖𝑗 + 𝜀𝑖𝑗
• In matrix formulation:
o 𝒀𝑖 = 𝑿𝑖 ∙ 𝜷 + 𝒁𝑖 ∙ 𝝊𝑖 + 𝜺𝑖
o where 𝑿𝑖 is the covariate matrix of the fixed effect(s) and 𝒁𝑖 the design
matrix of the random effect(s)
• Variance-covariance matrix of repeated measures y:
o 𝑉𝑎𝑟(𝒀𝑖) = 𝒁𝑖 ∙ 𝜮 ∙ 𝒁𝑖′ + 𝜎𝑒² ∙ 𝑰𝑛𝑖
o variance-covariance matrix of outcome for a patient’s measurements
depends on covariance matrix of random effects 𝜮 and residual
variance 𝜎𝑒2
o the number and variances of the random effects and their correlation(s)
determine part of the variance-covariance matrix
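The formula above can be sketched numerically in base R for a random-intercept + random-slope model at 6 time points. All parameter values below are made up for illustration, not estimates from the Reisby data:

```r
# Var(Y_i) = Z_i Sigma Z_i' + sigma_e^2 * I for one subject measured at t = 0..5
tim <- 0:5
Z <- cbind(1, tim)                    # design matrix of the random effects
Sigma <- matrix(c(12.6, -1.4,         # var(intercept), cov(intercept, slope)
                  -1.4,  2.1), 2, 2)  # cov(intercept, slope), var(slope)
sigma2_e <- 12.2                      # residual variance (illustrative)
V <- Z %*% Sigma %*% t(Z) + sigma2_e * diag(6)
round(V, 1)  # variances change over time; correlations differ per time lag
```

Note how even this simple random structure already produces unequal variances and lag-dependent covariances, which is the point of the slides that follow.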

7
LMM matrix formulation & var-covar matrix
$$
\boldsymbol{Y}_i = \boldsymbol{X}_i \cdot \boldsymbol{\beta} + \boldsymbol{Z}_i \cdot \boldsymbol{\upsilon}_i + \boldsymbol{\varepsilon}_i
$$

$$
\begin{pmatrix} y_{i1} \\ y_{i2} \\ y_{i3} \\ y_{i4} \\ y_{i5} \\ y_{i6} \end{pmatrix}
=
\begin{pmatrix} 1 & t_{i1} \\ 1 & t_{i2} \\ 1 & t_{i3} \\ 1 & t_{i4} \\ 1 & t_{i5} \\ 1 & t_{i6} \end{pmatrix}
\cdot
\begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}
+
\begin{pmatrix} 1 & t_{i1} \\ 1 & t_{i2} \\ 1 & t_{i3} \\ 1 & t_{i4} \\ 1 & t_{i5} \\ 1 & t_{i6} \end{pmatrix}
\cdot
\begin{pmatrix} \upsilon_{0i} \\ \upsilon_{1i} \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \varepsilon_{i3} \\ \varepsilon_{i4} \\ \varepsilon_{i5} \\ \varepsilon_{i6} \end{pmatrix}
=
\begin{pmatrix}
\beta_0 + \beta_1 \cdot t_{i1} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i1} + \varepsilon_{i1} \\
\beta_0 + \beta_1 \cdot t_{i2} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i2} + \varepsilon_{i2} \\
\beta_0 + \beta_1 \cdot t_{i3} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i3} + \varepsilon_{i3} \\
\beta_0 + \beta_1 \cdot t_{i4} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i4} + \varepsilon_{i4} \\
\beta_0 + \beta_1 \cdot t_{i5} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i5} + \varepsilon_{i5} \\
\beta_0 + \beta_1 \cdot t_{i6} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i6} + \varepsilon_{i6}
\end{pmatrix}
$$

8
LMM matrix formulation & var-covar matrix
$$
Var(\boldsymbol{Y}_i)
= \underbrace{\boldsymbol{Z}_i \cdot \boldsymbol{\Sigma} \cdot \boldsymbol{Z}_i'}_{\text{something complicated (depends on random effects)}}
+ \sigma_e^2 \cdot
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$

9
CPMs & var-covar matrix
Another possibility for modelling longitudinal data:

• 𝒀𝑖 = 𝑿𝑖 ∙ 𝜷 + 𝜺𝑖
o No random effects!
o How do we take into account the correlation between measurements
on same person?
• Variance-covariance matrix of repeated measures y is now:
o 𝑉𝑎𝑟(𝒀𝑖) = 𝑉𝑎𝑟(𝜺𝑖) = 𝜮
o and we’d better make it sufficiently complicated!

10
CPMs & var-covar matrix
• We know the residuals are not independent, so we need to assume
correlation for Σ:

$$
\boldsymbol{\Sigma} = (\text{this should be sufficiently complicated!})
$$

• We can use different correlation structures to model Σ directly


• We call this type of model a covariance pattern model (CPM)

11
Example: Reisby Data
Possibilities for modelling correlated measures

• Model correlation of measurements implicitly (linear mixed effects
  model):
o repeated observations are level-1 variables nested within patient
(=level 2)
• random intercept per patient or random intercept per patient + random
slope for time per patient or...
• random effects and their covariance determine structure of var-covar matrix
• Model correlation of measurements explicitly (CPM):
o incorporate a covariance structure of the residuals into the model
o (usually) assumes equally spaced time intervals
• Combination of the two (mixed regression models with
autocorrelated errors)
12
Various correlation structures
• Both R and SPSS have numerous correlation structures for linear
mixed models. We will discuss four of these:
o independent or scaled identity
o exchangeable or compound symmetry
o unstructured
o autoregressive of order 1: AR(1)
• with both homogeneous and heterogeneous variances
• In the next slides we apply those 4 correlation structures and
examine only the modelled variance-covariance matrix (summaries
of the models with fixed effects are ignored for now)

• Correlations pertain to the residuals within each of the subjects after
  correcting for fixed (and perhaps random) effects

13
Independent correlation structure
• The independent (scaled identity) correlation structure assumes
residuals to be independent, as if they came from different subjects
• All variances are assumed equal, all correlations are assumed 0
• This is the assumption in ordinary linear regression/ANOVA

$$
\boldsymbol{\Sigma} =
\begin{pmatrix}
\sigma^2 & 0 & 0 & 0 & 0 & 0 \\
0 & \sigma^2 & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma^2 & 0 & 0 & 0 \\
0 & 0 & 0 & \sigma^2 & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma^2 & 0 \\
0 & 0 & 0 & 0 & 0 & \sigma^2
\end{pmatrix}
= \sigma^2 \cdot
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$

14
Independent correlation structure
• Analyzing the data from our example using an independent
correlation structure amounts to doing a two-way ANOVA (all
observations are assumed to be independent) with time and
treatment as factors
• Even when observations are in fact (nearly) independent, the design
of the study was to take random patients, and measure these
multiple times, not to take random samples at each time point

➢ Preferable to analyze data as being repeated (and thus correlated)!

15
Compound symmetry correlation structure
• The compound symmetry (exchangeable) correlation structure
assumes correlations between all time points to be equal,
irrespective of the length of the time intervals.
• All variances are assumed equal, all correlations too:

$$
\boldsymbol{\Sigma} =
\begin{pmatrix}
\sigma^2+\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma^2+\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma_0^2 & \sigma^2+\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma^2+\sigma_0^2 & \sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma^2+\sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma^2+\sigma_0^2
\end{pmatrix}
= (\sigma_0^2+\sigma^2) \cdot
\begin{pmatrix}
1 & \rho & \rho & \rho & \rho & \rho \\
\rho & 1 & \rho & \rho & \rho & \rho \\
\rho & \rho & 1 & \rho & \rho & \rho \\
\rho & \rho & \rho & 1 & \rho & \rho \\
\rho & \rho & \rho & \rho & 1 & \rho \\
\rho & \rho & \rho & \rho & \rho & 1
\end{pmatrix}
$$

• Note: $\rho = \sigma_0^2 / (\sigma_0^2 + \sigma^2)$, with $\sigma_0^2$ the variance between patients
• ρ is then known as the intraclass correlation coefficient: proportion
total variance accounted for by clustering w/in individuals (level-2
units) ≈ corr among obs (level-1 units) w/in clusters (level-2 units)
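As a numerical illustration, plugging in the between-patient and residual variances estimated later for the Reisby data (slide 19) gives the ICC reported there:

```r
# ICC = sigma0^2 / (sigma0^2 + sigma^2), using the Reisby estimates
sigma0_sq <- 15.37   # variance between patients
sigma_sq  <- 18.74   # residual variance
rho <- sigma0_sq / (sigma0_sq + sigma_sq)
round(rho, 2)        # 0.45: about 45% of total variance is between patients
```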
16
Compound symmetry correlation structure
• A covariance pattern model with a compound symmetry pattern for
the residuals is equivalent to a linear mixed model (using time as
categorical) with a random intercept per patient
• In R/SPSS two separate ways to fit this model:
o define a “repeated” variable (time effect) and explicitly ask for a
correlation type: correlation=corCompSymm (R) or Repeated
Covariance Type: compound symmetry (SPSS)
o include a random intercept in the model and uncorrelated residuals:
this results (implicitly) in compound symmetry
• For data without missing outcomes, these two models are also
equivalent to a repeated measures (“split-plot”) ANOVA
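The equivalence of the two specifications can be verified numerically in base R: a random intercept (variance σ0²) plus independent residuals (variance σ²) implies exactly a compound-symmetry matrix. Values below are illustrative:

```r
# Random-intercept formulation: Z Sigma Z' + sigma^2 I, with Z a column of 1s
sigma0_sq <- 15.37; sigma_sq <- 18.74; n <- 6
Z <- matrix(1, n, 1)
V_ri <- Z %*% matrix(sigma0_sq) %*% t(Z) + sigma_sq * diag(n)
# Direct compound-symmetry pattern: sigma0^2 everywhere, sigma0^2 + sigma^2 on the diagonal
V_cs <- matrix(sigma0_sq, n, n)
diag(V_cs) <- sigma0_sq + sigma_sq
all.equal(V_ri, V_cs)   # the two matrices coincide
```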

17
Covariance pattern model with compound
symmetry correlation structure
> gls.cs <- gls(hdrs ~ week*endo, correlation=corCompSymm(form = ~ 1 | id), data=reisby.long,
na.action="na.omit", method="ML")
> summary(gls.cs)

Correlation Structure: Compound symmetry


Formula: ~1 | id
Parameter estimate(s):
Rho
0.450704

Coefficients:
Value Std.Error t-value p-value
(Intercept) 22.660871 1.114797 20.327349 0.0000
week1 -2.178112 1.167293 -1.865952 0.0629
week2 -5.663177 1.179220 -4.802478 0.0000
week3 -7.316043 1.167293 -6.267532 0.0000
week4 -10.040181 1.167293 -8.601256 0.0000
week5 -11.368453 1.191900 -9.538091 0.0000
endo 1.191178 1.506269 0.790814 0.4296
week1:endo 1.368264 1.593345 0.858737 0.3911
week2:endo 1.108425 1.584465 0.699558 0.4847
week3:endo 0.829647 1.581150 0.524711 0.6001
week4:endo 0.719739 1.593345 0.451716 0.6517
week5:endo 0.120221 1.631796 0.073674 0.9413

Residual standard error: 5.840399


Degrees of freedom: 375 total; 363 residual

18
CPM w/compound symmetry correlation structure
> gls.cs <- gls(hdrs ~ week*endo, correlation=corCompSymm(form = ~ 1 | id),
data=reisby.long, na.action="na.omit", method="ML")
> getVarCov(gls.cs, type="marginal")
Marginal variance covariance matrix
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 34.110 15.374 15.374 15.374 15.374 15.374
[2,] 15.374 34.110 15.374 15.374 15.374 15.374
[3,] 15.374 15.374 34.110 15.374 15.374 15.374
[4,] 15.374 15.374 15.374 34.110 15.374 15.374
[5,] 15.374 15.374 15.374 15.374 34.110 15.374
[6,] 15.374 15.374 15.374 15.374 15.374 34.110
Standard Deviations: 5.8404 5.8404 5.8404 5.8404 5.8404 5.8404

> corMatrix(gls.cs$modelStruct$corStruct)[[1]]
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1.000000 0.450704 0.450704 0.450704 0.450704 0.450704
[2,] 0.450704 1.000000 0.450704 0.450704 0.450704 0.450704
[3,] 0.450704 0.450704 1.000000 0.450704 0.450704 0.450704
[4,] 0.450704 0.450704 0.450704 1.000000 0.450704 0.450704
[5,] 0.450704 0.450704 0.450704 0.450704 1.000000 0.450704
[6,] 0.450704 0.450704 0.450704 0.450704 0.450704 1.000000

$\hat\sigma_0^2 = 15.37;\quad \hat\sigma^2 + \hat\sigma_0^2 = 34.11 \;\rightarrow\; \hat\sigma^2 = 18.74;\quad \hat\rho = 0.45$


19
Unstructured correlation
$$
\boldsymbol{\Sigma} =
\begin{pmatrix}
\theta_{11} & \theta_{12} & \theta_{13} & \theta_{14} & \theta_{15} & \theta_{16} \\
\theta_{21} & \theta_{22} & \theta_{23} & \theta_{24} & \theta_{25} & \theta_{26} \\
\theta_{31} & \theta_{32} & \theta_{33} & \theta_{34} & \theta_{35} & \theta_{36} \\
\theta_{41} & \theta_{42} & \theta_{43} & \theta_{44} & \theta_{45} & \theta_{46} \\
\theta_{51} & \theta_{52} & \theta_{53} & \theta_{54} & \theta_{55} & \theta_{56} \\
\theta_{61} & \theta_{62} & \theta_{63} & \theta_{64} & \theta_{65} & \theta_{66}
\end{pmatrix}
$$

• Variances at each time point different
• All covariances among time points different (note that θ12 = θ21)
• Costly structure: 21 df needed for 6 time points!
• Flexible structure
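The "21 df" follows from a simple count, n variances plus n(n−1)/2 covariances:

```r
# Parameter count for an unstructured n x n covariance matrix: n*(n+1)/2
n <- 6
n * (n + 1) / 2   # 21 parameters for the 6 Reisby time points
```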

20
Unstructured correlation in R (cont.)
> gls.unstruct <- gls(hdrs ~ week*endo, correlation=corSymm(form = ~ 1 |
id), weights=varIdent(form=~1|week), data=reisby.long,
na.action="na.omit", method="ML")

> getVarCov(gls.unstruct, individual = "101")


## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 19.6330 10.3980 6.3449 8.086 6.7260 4.5004
## [2,] 10.3980 20.8660 10.2570 10.692 8.3337 5.0218
## [3,] 6.3449 10.2570 25.9020 22.360 23.9850 20.8840
## [4,] 8.0860 10.6920 22.3600 38.439 32.0320 29.9000
## [5,] 6.7260 8.3337 23.9850 32.032 46.8950 38.7660
## [6,] 4.5004 5.0218 20.8840 29.900 38.7660 60.1860
## Standard Deviations: 4.4309 4.5679 5.0894 6.1999 6.848 7.758

21
Autoregressive of order 1 (AR1) correlation
$$
\boldsymbol{\Sigma} =
\begin{pmatrix}
\sigma^2 & \sigma^2\rho & \sigma^2\rho^2 & \sigma^2\rho^3 & \sigma^2\rho^4 & \sigma^2\rho^5 \\
\sigma^2\rho & \sigma^2 & \sigma^2\rho & \sigma^2\rho^2 & \sigma^2\rho^3 & \sigma^2\rho^4 \\
\sigma^2\rho^2 & \sigma^2\rho & \sigma^2 & \sigma^2\rho & \sigma^2\rho^2 & \sigma^2\rho^3 \\
\sigma^2\rho^3 & \sigma^2\rho^2 & \sigma^2\rho & \sigma^2 & \sigma^2\rho & \sigma^2\rho^2 \\
\sigma^2\rho^4 & \sigma^2\rho^3 & \sigma^2\rho^2 & \sigma^2\rho & \sigma^2 & \sigma^2\rho \\
\sigma^2\rho^5 & \sigma^2\rho^4 & \sigma^2\rho^3 & \sigma^2\rho^2 & \sigma^2\rho & \sigma^2
\end{pmatrix}
= \sigma^2 \cdot
\begin{pmatrix}
1 & \rho & \rho^2 & \rho^3 & \rho^4 & \rho^5 \\
\rho & 1 & \rho & \rho^2 & \rho^3 & \rho^4 \\
\rho^2 & \rho & 1 & \rho & \rho^2 & \rho^3 \\
\rho^3 & \rho^2 & \rho & 1 & \rho & \rho^2 \\
\rho^4 & \rho^3 & \rho^2 & \rho & 1 & \rho \\
\rho^5 & \rho^4 & \rho^3 & \rho^2 & \rho & 1
\end{pmatrix}
$$

• Observations per subject assumed to be taken at equally-spaced
  intervals
• AR(1) assumes all observations 1 time unit apart have same
correlations
• Observations 2 units apart have corr ρ2, obs 3 units apart ρ3, etc.
• Outcome has same variance (𝜎 2 ) across all time points
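The pattern Σ[j,k] = σ²ρ^|j−k| is easy to construct in base R. The values below roughly mimic the homogeneous AR(1) fit shown on the next slide (σ² ≈ 35.05, ρ ≈ 0.659), purely for illustration:

```r
# Homogeneous AR(1) pattern: Sigma[j,k] = sigma^2 * rho^|j-k|
sigma_sq <- 35.05
rho <- 0.659
Sigma <- sigma_sq * rho ^ abs(outer(0:5, 0:5, "-"))
round(Sigma[1, ], 1)   # covariances decay geometrically with the time lag
```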

22
AR1 with homogeneous variances in R (cont.)
> gls.AR1 <- gls(hdrs ~ week*endo, correlation=corAR1(form = ~ 1 | id),
data=reisby.long, na.action="na.omit", method="ML")
> getVarCov(gls.AR1, type="marginal")
Marginal variance covariance matrix
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 35.0510 23.1050 15.231 10.040 6.6183 4.3627
[2,] 23.1050 35.0510 23.105 15.231 10.0400 6.6183
[3,] 15.2310 23.1050 35.051 23.105 15.2310 10.0400
[4,] 10.0400 15.2310 23.105 35.051 23.1050 15.2310
[5,] 6.6183 10.0400 15.231 23.105 35.0510 23.1050
[6,] 4.3627 6.6183 10.040 15.231 23.1050 35.0510
Standard Deviations: 5.9204 5.9204 5.9204 5.9204 5.9204 5.9204

23
AR1 with heterogeneous variances in R (cont.)
> gls.hAR1 <- gls(hdrs ~ week*endo, correlation=corAR1(form = ~ 1 | id),
weight = varIdent(form = ~ 1 | week), data=reisby.long,
na.action="na.omit", method="ML")

> getVarCov(gls.hAR1, individual = "101")


## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 22.7480 15.8000 10.634 7.2293 4.9470 3.6599
## [2,] 15.8000 26.8450 18.067 12.2830 8.4055 6.2185
## [3,] 10.6340 18.0670 29.747 20.2240 13.8390 10.2380
## [4,] 7.2293 12.2830 20.224 33.6350 23.0160 17.0280
## [5,] 4.9470 8.4055 13.839 23.0160 38.5290 28.5050
## [6,] 3.6599 6.2185 10.238 17.0280 28.5050 51.5880
## Standard Deviations: 4.7695 5.1812 5.454 5.7996 6.2072 7.1825

Heterogeneous variances probably fit the data better


24
Break & practice
• Ex 1
• Next lecture at 11.45

25
“Covariance patterns” of linear mixed models
• A random intercept model implies a compound symmetry structure
for all data combined
• A linear mixed model with random intercept and random slope also
implies a certain correlation structure for the data, but this is by no
means a simple structure
o structure depends on the actual parameter estimates, but: usually the
variances increase for later time points and correlations decrease when
time points are further apart
o this is exactly what we observed for our data set, so this model might
fit the data quite well

26
Covariance patterns linear mixed models
Time as continuous, random intercept + random slope model:
> getVarCov(lme.ris,type="marginal")

Marginal variance covariance matrix


1 2 3 4 5 6
1 23.8600 10.239 8.838 7.4364 6.0349 4.6334
2 10.2390 23.134 11.590 12.2660 12.9420 13.6170
3 8.8380 11.590 26.562 17.0960 19.8480 22.6010
4 7.4364 12.266 17.096 34.1440 26.7550 31.5840
5 6.0349 12.942 19.848 26.7550 45.8800 40.5680
6 4.6334 13.617 22.601 31.5840 40.5680 61.7700

27
Random intercept / random slope model + AR(1)
• Final refinement to the random structure of the model: after
“correcting for” random intercept & slope (time) per patient, assume
residuals within the patients are correlated via an autocorrelation of
order 1
• For longitudinal data this correlation structure makes sense
o if you’re relatively high at time T you’re expected to be still high at T+1
o score at T+1 depends only on T, not on previous scores.
• AR(1) assumes that observations take place at equally-spaced
intervals, for each subject

28
Random intercept / random slope model + cAR(1)
• The AR(1) correlation structure can be modified in such a way that
time intervals between observations need not be equally long, nor
need they be the same for each subject:
o continuous AR(1), or cAR(1), correlation structure is defined via
  $\rho(s) = \rho^{|s|}$
o where 0 ≤ ρ < 1 and
o s is the number of time units by which 2 observations are separated
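A base-R sketch of the cAR(1) idea, corr(s) = ρ^|s| for any, possibly non-integer, separation s. The measurement times and ρ below are made up to show the unequal spacing:

```r
# cAR(1) correlation matrix for unequally spaced measurement times
rho <- 0.7
times <- c(0, 0.5, 1.3, 2, 4.1)           # hypothetical, unequally spaced
R <- rho ^ abs(outer(times, times, "-"))  # corr(s) = rho^|s|
round(R[1, ], 3)   # correlation with the first measurement decays with |s|
```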
• This model can be fitted in R (SPSS has AR(1) but not cAR(1) ):

lme.ris.CAR1 <- lme(fixed = HDRS ~ time*endo, random = ~ time|id,
                    data = reisby.long.noNA, method = "ML",
                    correlation = corCAR1(form = ~ time | id))

29
Reisby example, comparing correlation structures
• We have tried several logical models to fit the variance-covariance
matrix of the HDRS scores over time...
o CPM with heterogeneous AR(1) correlation structure
o CPM with unstructured correlation structure
o LME with random intercept + slope (for time)
o LME with random intercept + slope + cAR(1) correlation structure
• ...and several less logical models:
o CPM with identity correlation structure
o CPM with compound symmetry/LME with random intercept
o CPM with homogeneous AR(1) correlation structure

30
Reisby example, comparing correlation structures
• Four models with the same fixed structure (endo*week, 12 degrees
of freedom) and different random parts.
• Compare these using ML-based methods (LRT and / or AIC):

Model            # cov par   -2*logLike     AIC
Identity              1       2388.027   2414.027
Comp Symm             2       2277.381   2305.381
Unstructured         21       2183.227   2249.227
AR(1) homogen         2       2221.847   2249.847
AR(1) heterogen       7       2207.462   2245.462
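The AIC column can be reproduced from the table itself: AIC = −2·logLik + 2·(12 fixed-effect df + number of covariance parameters):

```r
# Reconstruct the AIC column from -2*logLike and the parameter counts
m2ll <- c(Identity = 2388.027, CompSymm = 2277.381, Unstructured = 2183.227,
          AR1hom = 2221.847, AR1het = 2207.462)
ncov <- c(1, 2, 21, 2, 7)           # covariance parameters per model
aic  <- m2ll + 2 * (12 + ncov)      # 12 fixed-effect df in every model
round(aic, 3)                       # matches the AIC column above
```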
31
Reisby example, comparing correlation structures
• From the comparison of the AICs:
o Taking dependence into account greatly improves the model fit
o Assuming equal variances and equal correlations is not a good option
for these data
o The parsimonious homog. AR(1) just as good as unstructured
o The AR(1) with heterogeneous variances is best

• We could also have used LRTs (for nested models only), with
corrected p-values:
o Homogeneous vs heterogeneous AR(1) (for instance):
• LRT = 2221.847 − 2207.462 = 14.385 with 7 − 2 = 5 df; p = 0.0133/2 = 0.0067
• heterogeneous is significantly better than homogeneous AR(1)
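The LRT arithmetic can be checked with `pchisq`, using the −2·logLike values from the comparison table:

```r
# LRT: homogeneous vs heterogeneous AR(1)
lrt <- 2221.847 - 2207.462               # difference in -2*log-likelihood
df  <- 7 - 2                             # 5 extra covariance parameters
p   <- pchisq(lrt, df, lower.tail = FALSE)
round(p, 4)                              # ~0.0133, which the slide then halves
```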

32
Which of the random structures fits best?
• In general: better to think in advance about which structure makes
sense for your data (“data-generating mechanism”) instead of fitting
and testing a lot of different structures
• If exploratory:
o models can be compared using AIC or LRT
o as with all modelling (also for fixed effects): keep in mind that the
“best” model is the best model for this one sample
o if you’ve used 10 steps to get to one model, your p-values and
confidence intervals will be too optimistic

33
Reisby example, comparing correlation structures
• We also compared LMEs with *linear* time trend with one another:
o fixed: time, endo, time*endo
o random: intercept vs. int+slope vs. int+slope+cAR(1)

> anova(lme.ris.CAR1, lme.ris, lme.ril)


Model df AIC BIC logLik Test L.Ratio p-value
lme.ris.CAR1 1 9 2224.838 2260.180 -1103.419
lme.ris 2 8 2230.929 2262.345 -1107.465 1 vs 2 8.09133 0.0044
lme.ril 3 6 2294.137 2317.699 -1141.069 2 vs 3 67.20798 <.0001

Model with rand int+slope+cAR(1) is better than model with rand
int+slope (LRT or AIC)
Model with rand int+slope is better than rand int only (LRT or AIC)

34
Reisby example, comparing correlation structures
• No LRT to compare LMEs with linear time trend to CPMs with
categorical time (differing fixed effects, non-nested models)
• Use AIC:

Model AIC
CPM Unstructured 2249.227
CPM AR(1) heterogen 2245.462
LME rand int + slope 2230.929
LME rand int + slope + cAR(1) 2224.838

Model with rand int+slope+cAR(1) is best according to AIC
(We used model w/o cAR(1) yesterday to reduce fixed effects & draw
conclusions - oops)

35
Break & practice
• Ex 1 & 2
• Lecture at 13:30

36
Checking assumptions of the model
• Model assumptions:
o linearity (if we use time – or other covariates – as linear)
• check with individual plots, spaghetti plots, residual plots
o normality of residuals
o normality of random intercepts (& slopes, if used)
• these three can be saved and checked using Q-Q plots, boxplots,
histograms
• but: generally not helpful
1. because deviations from normality probably not a big problem for
inference on fixed effects (if your interest is in inference on random
effects, there could be a problem)
2. model ‘inflicts’ normality on the random effects, so normality of the
estimated random effects may partly reflect model assumptions
o independence of residuals (once fixed and random effects are taken
into account)
• as in linear models: keep your fingers crossed!

37
Checking assumptions of the model in R
Diagnostic plots for (level-1) residuals: Q-Q plot of residuals (RI+RS model)
• Aside from three outliers, no departures from normality

[Figure: Q-Q plot, standardized residuals vs. quantiles of standard normal]

38
Checking assumptions of the model in R
Diagnostic plots for (level-1) residuals: residuals vs. fitted

• (We’re hoping for a graph with no patterns)
• This looks a bit problematic: a slight trend towards higher
  residuals with higher fitted values
• Problems with linearity and/or missing covariates?

[Figure: standardized residuals vs. fitted values]

39
Checking assumptions of the model in R
Diagnostic plots for (level-1) residuals: residuals vs. fitted by endo
[Figure: standardized residuals vs. fitted values, by endo group]

The problems we saw in the whole sample are present in both groups

40
Checking assumptions of the model in R
Boxplots residuals per subject
[Figure: boxplots of residuals per subject (id), residuals roughly −10 to 15]

We can use this plot to check for large outliers in residuals per person

41
Checking assumptions of the model in R
Observed vs. fitted per subject
[Figure: observed vs. fitted HDRS, one panel per subject id]

This plot can be used to check for individuals with poor agreement
between observed and fitted HDRS scores.
42
Checking assumptions of the model in R

• Quick plot of random intercepts and slopes for time
• (We’re not looking for patterns here, just for large outliers)
• No obvious outliers, though a few high-ish slopes, a couple
  high-ish intercepts

[Figure: per-subject coefficients, (Intercept) and time, plotted against id]

43
Statistical conclusions: imipramine example
• A model with fixed linear time effect, random intercept and random
slope for time, and cAR(1) correlation for the within subject residuals
seems to provide the “best” fit for these data
• The assumptions of normality for the level-1 and level-2 random
effects seem reasonable
• The assumption of constant variance of residuals (given the random
effects) might be violated

➢ Now we may present our results (with caution)

44
Three-level models
• So far: two levels
o children within schools, patients within hospitals
o measurements within individuals over time
• What about three levels?
o children within classrooms within schools
o longitudinal measurements within patients within hospitals

45
Analyzing three-level models
• Variance at 3 levels
o random effects (which??) at 2 levels
• Variables measured at 3 levels (children w/in classrooms w/in
schools)?
o main effects, interactions
o “cross-level” interactions (SES of school * SES of child, gender of
teacher * gender of child)
• Think carefully about design
o possible sources of variation
o effects at lower level that could possibly differ at higher level
• teacher-level (2) variables (gender, experience) could have different effects
at different schools (3)
• child-level (1) variables (gender, entrance exam score) could have different
effects in different classrooms (2) or at different schools (3)
• Think about research question: simplicity vs generalizability
46
Example three-level data
• Monday lab: multi-center hypertension trial: 27 centers, 193 patients,
  4 post-randomization DBP measurements
• Sources of variation:
o centers (level 3):
• may serve different populations, with (on average) higher or lower BP
o patients (level 2):
• patients vary greatly in their blood pressure levels
– age, gender, baseline BMI, treatment
• patients may vary (greatly?) in their trends over time
o measurements in time (level 1):
• BP varies considerably from moment to moment, day to day within
individuals
– stress level, tx adherence, BMI at the moment, ...

47
Example three-level data
• Design:
o randomized trial, so interest in tx & tx*time
o hospital-level variables: none provided
o patient-level variables: treatment (& baseline DBP?)
o measurement-level variables: time, DBP
• Fixed effects:
o (intercept,) time (linear or categorical?), tx, tx*time
o baseline DBP (why?)
• Random effects?
o differences in avg DBP among centers → random intercept per center
o difference in tx effect per center? → random tx per center (tricky!!)
o difference in time trend per center? → random time trend per center
o differences among patients → random intercept (& time trend) per
patient
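A hypothetical nlme sketch of such a three-level fit, with measurements (level 1) nested in patients (level 2) nested in centers (level 3). The data are simulated here, and the variable names (dbp, center, id) are assumptions, not the actual lab data set:

```r
library(nlme)
set.seed(1)
# Simulate 5 centers x 8 patients x 4 time points
d <- expand.grid(time = 0:3, pt = 1:8, center = factor(1:5))
d$id  <- factor(paste(d$center, d$pt))     # unique patient labels
u_ctr <- rnorm(5, sd = 2)                  # center-level random intercepts
u_pt  <- rnorm(nlevels(d$id), sd = 4)      # patient-level random intercepts
d$dbp <- 95 - 1.5 * d$time + u_ctr[as.integer(d$center)] +
         u_pt[as.integer(d$id)] + rnorm(nrow(d), sd = 3)
# Nested random intercepts: per center, and per patient within center
fit <- lme(dbp ~ time, random = ~ 1 | center/id, data = d)
VarCorr(fit)   # variance components at center, patient, and residual level
```

The `center/id` nesting is the key piece: it puts a random intercept at each of the two upper levels, mirroring the sources of variation listed above.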

48
Summary
• We can account for correlation of measurements over time
o explicitly: variance-covariance matrix of residuals (CPMs)
• primarily when everyone (theoretically) measured at same time points
o implicitly: random intercept, random slope for time (LMEs)
• can be used for measurements at same time points but also for differing
time points
o both explicitly & implicitly: LMEs with autocorrelated errors
• Model building better done in protocol
o Otherwise: use LRT or AIC to build random part of model, then to
simplify fixed part of model
• Some model assumptions (linearity, normality of res) can be checked
• 3+ levels also possible (complicated, but possible)

49
