Mixed Models Day 3 - 2023

Mixed Models Day 3:

Longitudinal Data II
Modelling Time as Categorical
Rebecca Stellato
Longitudinal data (modelling time as categorical)
Overview

• Reisby example, revisited


• Longitudinal data: modelling time as categorical
o the variance-covariance matrix of repeated measures
o correlation structures & covariance pattern models
o covariance patterns of linear mixed models
• Checking assumptions of mixed models
• Three-level models

2
Reisby example, revisited
Hamilton Depression Rating Score, 66 patients, 6 time points

3
Reisby example, revisited
Hamilton Depression Rating Score, 66 patients, 6 time points
> describe(reisby.wide[,3:8])
n mean sd
hdrs.0 61 23.44 4.53
hdrs.1 63 21.84 4.70
hdrs.2 65 18.31 5.49
hdrs.3 65 16.42 6.42
hdrs.4 63 13.62 6.97
hdrs.5 58 11.95 7.22

hdrs.0 hdrs.1 hdrs.2 hdrs.3 hdrs.4 hdrs.5


hdrs.0 1.000 0.493 0.410 0.333 0.227 0.184
hdrs.1 0.493 1.000 0.494 0.412 0.308 0.218
hdrs.2 0.410 0.494 1.000 0.738 0.669 0.461
hdrs.3 0.333 0.412 0.738 1.000 0.817 0.568
hdrs.4 0.227 0.308 0.669 0.817 1.000 0.654
hdrs.5 0.184 0.218 0.461 0.568 0.654 1.000

4
Observed var-cov matrix Reisby dataset
(Combining SD’s and correlations from previous slide)

> round(var(reisby.wide[,3:8], use="pairwise.complete.obs"), digits = 2)

hdrs.0 hdrs.1 hdrs.2 hdrs.3 hdrs.4 hdrs.5


hdrs.0 20.55 10.11 10.14 10.09 7.19 6.28
hdrs.1 10.11 22.07 12.28 12.55 10.26 7.72
hdrs.2 10.14 12.28 30.09 25.13 24.63 18.38
hdrs.3 10.09 12.55 25.13 41.15 37.34 23.99
hdrs.4 7.19 10.26 24.63 37.34 48.59 30.51
hdrs.5 6.28 7.72 18.38 23.99 30.51 52.12

5
Random intercept + random linear time effect
• Yesterday we modelled time as continuous, and (mostly) linear
• What if we think it is unreasonable to use time as linear?
o time continuous: polynomials (yesterday)
o time continuous: splines (beyond the scope of the course)
o time continuous: non-linear models (tomorrow, briefly)

o time “categorical”: when we have a fixed number of time points at
  which everyone (theoretically) is measured, we can model time as a
  factor variable
6
LMM matrix formulation & var-covar matrix
• Recall the linear mixed model:
o 𝑦𝑖𝑗 = 𝛽0 + 𝛽1𝑋1𝑖𝑗 + ⋯ + 𝜐0𝑖 + 𝜐1𝑖𝑋1𝑖𝑗 + 𝜀𝑖𝑗
• In matrix formulation:
o 𝒀𝑖 = 𝑿𝑖 ∙ 𝜷 + 𝒁𝑖 ∙ 𝝊𝑖 + 𝜺𝑖
o where 𝑿𝑖 is the covariate matrix of the fixed effect(s) and 𝒁𝑖 the design
matrix of the random effect(s)
• Variance-covariance matrix of repeated measures y:
o 𝑉𝑎𝑟(𝒀𝑖) = 𝒁𝑖 ∙ 𝜮 ∙ 𝒁𝑖′ + 𝜎𝑒² ∙ 𝑰𝑛𝑖
o variance-covariance matrix of outcome for a patient’s measurements
depends on covariance matrix of random effects 𝜮 and residual
variance 𝜎𝑒2
o the number and variances of the random effects and their correlation(s)
determine part of the variance-covariance matrix
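The formula above can be sketched numerically in base R for a random-intercept + random-slope model at 6 time points. All parameter values below are made up for illustration, not estimates from the Reisby data:

```r
# Var(Y_i) = Z_i Sigma Z_i' + sigma_e^2 * I for one subject measured at t = 0..5
tim <- 0:5
Z <- cbind(1, tim)                    # design matrix of the random effects
Sigma <- matrix(c(12.6, -1.4,         # var(intercept), cov(intercept, slope)
                  -1.4,  2.1), 2, 2)  # cov(intercept, slope), var(slope)
sigma2_e <- 12.2                      # residual variance (illustrative)
V <- Z %*% Sigma %*% t(Z) + sigma2_e * diag(6)
round(V, 1)  # variances change over time; correlations differ per time lag
```

Note how even this simple random structure already produces unequal variances and lag-dependent covariances, which is the point of the slides that follow.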

7
LMM matrix formulation & var-covar matrix
$$
\boldsymbol{Y}_i = \boldsymbol{X}_i \cdot \boldsymbol{\beta} + \boldsymbol{Z}_i \cdot \boldsymbol{\upsilon}_i + \boldsymbol{\varepsilon}_i
$$

$$
\begin{pmatrix} y_{i1} \\ y_{i2} \\ y_{i3} \\ y_{i4} \\ y_{i5} \\ y_{i6} \end{pmatrix}
=
\begin{pmatrix} 1 & t_{i1} \\ 1 & t_{i2} \\ 1 & t_{i3} \\ 1 & t_{i4} \\ 1 & t_{i5} \\ 1 & t_{i6} \end{pmatrix}
\cdot
\begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}
+
\begin{pmatrix} 1 & t_{i1} \\ 1 & t_{i2} \\ 1 & t_{i3} \\ 1 & t_{i4} \\ 1 & t_{i5} \\ 1 & t_{i6} \end{pmatrix}
\cdot
\begin{pmatrix} \upsilon_{0i} \\ \upsilon_{1i} \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \varepsilon_{i3} \\ \varepsilon_{i4} \\ \varepsilon_{i5} \\ \varepsilon_{i6} \end{pmatrix}
=
\begin{pmatrix}
\beta_0 + \beta_1 \cdot t_{i1} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i1} + \varepsilon_{i1} \\
\beta_0 + \beta_1 \cdot t_{i2} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i2} + \varepsilon_{i2} \\
\beta_0 + \beta_1 \cdot t_{i3} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i3} + \varepsilon_{i3} \\
\beta_0 + \beta_1 \cdot t_{i4} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i4} + \varepsilon_{i4} \\
\beta_0 + \beta_1 \cdot t_{i5} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i5} + \varepsilon_{i5} \\
\beta_0 + \beta_1 \cdot t_{i6} + \upsilon_{0i} + \upsilon_{1i} \cdot t_{i6} + \varepsilon_{i6}
\end{pmatrix}
$$

8
LMM matrix formulation & var-covar matrix
$$
Var(\boldsymbol{Y}_i)
= \underbrace{\boldsymbol{Z}_i \cdot \boldsymbol{\Sigma} \cdot \boldsymbol{Z}_i'}_{\text{something complicated (depends on random effects)}}
+ \sigma_e^2 \cdot
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$

9
CPMs & var-covar matrix
Another possibility for modelling longitudinal data:

• 𝒀𝑖 = 𝑿𝑖 ∙ 𝜷 + 𝜺𝑖
o No random effects!
o How do we take into account the correlation between measurements
on same person?
• Variance-covariance matrix of repeated measures y is now:
o 𝑉𝑎𝑟(𝒀𝑖) = 𝑉𝑎𝑟(𝜺𝑖) = 𝜮
o and we’d better make it sufficiently complicated!

10
CPMs & var-covar matrix
• We know the residuals are not independent, so we need to assume
correlation for Σ:

$$
\boldsymbol{\Sigma} = (\text{this should be sufficiently complicated!})
$$

• We can use different correlation structures to model Σ directly


• We call this type of model a covariance pattern model (CPM)

11
Example: Reisby Data
Possibilities for modelling correlated measures

• Model correlation of measurements implicitly (linear mixed effects
  model):
o repeated observations are level-1 variables nested within patient
(=level 2)
• random intercept per patient or random intercept per patient + random
slope for time per patient or...
• random effects and their covariance determine structure of var-covar matrix
• Model correlation of measurements explicitly (CPM):
o incorporate a covariance structure of the residuals into the model
o (usually) assumes equally spaced time intervals
• Combination of the two (mixed regression models with
autocorrelated errors)
12
Various correlation structures
• Both R and SPSS have numerous correlation structures for linear
mixed models. We will discuss four of these:
o independent or scaled identity
o exchangeable or compound symmetry
o unstructured
o autoregressive of order 1: AR(1)
• with both homogeneous and heterogeneous variances
• In the next slides we apply those 4 correlation structures and
examine only the modelled variance-covariance matrix (summaries
of the models with fixed effects are ignored for now)

• Correlations pertain to the residuals within each of the subjects after
  correcting for fixed (and perhaps random) effects

13
Independent correlation structure
• The independent (scaled identity) correlation structure assumes
residuals to be independent, as if they came from different subjects
• All variances are assumed equal, all correlations are assumed 0
• This is the assumption in ordinary linear regression/ANOVA

$$
\boldsymbol{\Sigma} =
\begin{pmatrix}
\sigma^2 & 0 & 0 & 0 & 0 & 0 \\
0 & \sigma^2 & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma^2 & 0 & 0 & 0 \\
0 & 0 & 0 & \sigma^2 & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma^2 & 0 \\
0 & 0 & 0 & 0 & 0 & \sigma^2
\end{pmatrix}
= \sigma^2 \cdot
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$

14
Independent correlation structure
• Analyzing the data from our example using an independent
correlation structure amounts to doing a two-way ANOVA (all
observations are assumed to be independent) with time and
treatment as factors
• Even when observations are in fact (nearly) independent, the design
of the study was to take random patients, and measure these
multiple times, not to take random samples at each time point

➢ Preferable to analyze data as being repeated (and thus correlated)!

15
Compound symmetry correlation structure
• The compound symmetry (exchangeable) correlation structure
assumes correlations between all time points to be equal,
irrespective of the length of the time intervals.
• All variances are assumed equal, all correlations too:

$$
\boldsymbol{\Sigma} =
\begin{pmatrix}
\sigma^2+\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma^2+\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma_0^2 & \sigma^2+\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma^2+\sigma_0^2 & \sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma^2+\sigma_0^2 & \sigma_0^2 \\
\sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma_0^2 & \sigma^2+\sigma_0^2
\end{pmatrix}
= (\sigma_0^2+\sigma^2) \cdot
\begin{pmatrix}
1 & \rho & \rho & \rho & \rho & \rho \\
\rho & 1 & \rho & \rho & \rho & \rho \\
\rho & \rho & 1 & \rho & \rho & \rho \\
\rho & \rho & \rho & 1 & \rho & \rho \\
\rho & \rho & \rho & \rho & 1 & \rho \\
\rho & \rho & \rho & \rho & \rho & 1
\end{pmatrix}
$$

• Note: $\rho = \sigma_0^2 / (\sigma_0^2 + \sigma^2)$, with $\sigma_0^2$ the variance between patients
• ρ is then known as the intraclass correlation coefficient: proportion
total variance accounted for by clustering w/in individuals (level-2
units) ≈ corr among obs (level-1 units) w/in clusters (level-2 units)
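As a numerical illustration, plugging in the between-patient and residual variances estimated later for the Reisby data (slide 19) gives the ICC reported there:

```r
# ICC = sigma0^2 / (sigma0^2 + sigma^2), using the Reisby estimates
sigma0_sq <- 15.37   # variance between patients
sigma_sq  <- 18.74   # residual variance
rho <- sigma0_sq / (sigma0_sq + sigma_sq)
round(rho, 2)        # 0.45: about 45% of total variance is between patients
```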
16
Compound symmetry correlation structure
• A covariance pattern model with a compound symmetry pattern for
the residuals is equivalent to a linear mixed model (using time as
categorical) with a random intercept per patient
• In R/SPSS two separate ways to fit this model:
o define a “repeated” variable (time effect) and explicitly ask for a
correlation type: correlation=corCompSymm (R) or Repeated
Covariance Type: compound symmetry (SPSS)
o include a random intercept in the model and uncorrelated residuals:
this results (implicitly) in compound symmetry
• For data without missing outcomes, these two models are also
equivalent to a repeated measures (“split-plot”) ANOVA
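The equivalence of the two specifications can be verified numerically in base R: a random intercept (variance σ0²) plus independent residuals (variance σ²) implies exactly a compound-symmetry matrix. Values below are illustrative:

```r
# Random-intercept formulation: Z Sigma Z' + sigma^2 I, with Z a column of 1s
sigma0_sq <- 15.37; sigma_sq <- 18.74; n <- 6
Z <- matrix(1, n, 1)
V_ri <- Z %*% matrix(sigma0_sq) %*% t(Z) + sigma_sq * diag(n)
# Direct compound-symmetry pattern: sigma0^2 everywhere, sigma0^2 + sigma^2 on the diagonal
V_cs <- matrix(sigma0_sq, n, n)
diag(V_cs) <- sigma0_sq + sigma_sq
all.equal(V_ri, V_cs)   # the two matrices coincide
```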

17
Covariance pattern model with compound
symmetry correlation structure
> gls.cs <- gls(hdrs ~ week*endo, correlation=corCompSymm(form = ~ 1 | id), data=reisby.long,
na.action="na.omit", method="ML")
> summary(gls.cs)

Correlation Structure: Compound symmetry


Formula: ~1 | id
Parameter estimate(s):
Rho
0.450704

Coefficients:
Value Std.Error t-value p-value
(Intercept) 22.660871 1.114797 20.327349 0.0000
week1 -2.178112 1.167293 -1.865952 0.0629
week2 -5.663177 1.179220 -4.802478 0.0000
week3 -7.316043 1.167293 -6.267532 0.0000
week4 -10.040181 1.167293 -8.601256 0.0000
week5 -11.368453 1.191900 -9.538091 0.0000
endo 1.191178 1.506269 0.790814 0.4296
week1:endo 1.368264 1.593345 0.858737 0.3911
week2:endo 1.108425 1.584465 0.699558 0.4847
week3:endo 0.829647 1.581150 0.524711 0.6001
week4:endo 0.719739 1.593345 0.451716 0.6517
week5:endo 0.120221 1.631796 0.073674 0.9413

Residual standard error: 5.840399


Degrees of freedom: 375 total; 363 residual

18
CPM w/compound symmetry correlation structure
> gls.cs <- gls(hdrs ~ week*endo, correlation=corCompSymm(form = ~ 1 | id),
data=reisby.long, na.action="na.omit", method="ML")
> getVarCov(gls.cs, type="marginal")
Marginal variance covariance matrix
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 34.110 15.374 15.374 15.374 15.374 15.374
[2,] 15.374 34.110 15.374 15.374 15.374 15.374
[3,] 15.374 15.374 34.110 15.374 15.374 15.374
[4,] 15.374 15.374 15.374 34.110 15.374 15.374
[5,] 15.374 15.374 15.374 15.374 34.110 15.374
[6,] 15.374 15.374 15.374 15.374 15.374 34.110
Standard Deviations: 5.8404 5.8404 5.8404 5.8404 5.8404 5.8404

> corMatrix(gls.cs$modelStruct$corStruct)[[1]]
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1.000000 0.450704 0.450704 0.450704 0.450704 0.450704
[2,] 0.450704 1.000000 0.450704 0.450704 0.450704 0.450704
[3,] 0.450704 0.450704 1.000000 0.450704 0.450704 0.450704
[4,] 0.450704 0.450704 0.450704 1.000000 0.450704 0.450704
[5,] 0.450704 0.450704 0.450704 0.450704 1.000000 0.450704
[6,] 0.450704 0.450704 0.450704 0.450704 0.450704 1.000000

$\hat\sigma_0^2 = 15.37;\quad \hat\sigma^2 + \hat\sigma_0^2 = 34.11 \;\rightarrow\; \hat\sigma^2 = 18.74;\quad \hat\rho = 0.45$


19
Unstructured correlation
$$
\boldsymbol{\Sigma} =
\begin{pmatrix}
\theta_{11} & \theta_{12} & \theta_{13} & \theta_{14} & \theta_{15} & \theta_{16} \\
\theta_{21} & \theta_{22} & \theta_{23} & \theta_{24} & \theta_{25} & \theta_{26} \\
\theta_{31} & \theta_{32} & \theta_{33} & \theta_{34} & \theta_{35} & \theta_{36} \\
\theta_{41} & \theta_{42} & \theta_{43} & \theta_{44} & \theta_{45} & \theta_{46} \\
\theta_{51} & \theta_{52} & \theta_{53} & \theta_{54} & \theta_{55} & \theta_{56} \\
\theta_{61} & \theta_{62} & \theta_{63} & \theta_{64} & \theta_{65} & \theta_{66}
\end{pmatrix}
$$

• Variances at each time point different
• All covariances among time points different (note that θ12 = θ21)
• Costly structure: 21 df needed for 6 time points!
• Flexible structure
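The "21 df" follows from a simple count, n variances plus n(n−1)/2 covariances:

```r
# Parameter count for an unstructured n x n covariance matrix: n*(n+1)/2
n <- 6
n * (n + 1) / 2   # 21 parameters for the 6 Reisby time points
```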

20
Unstructured correlation in R (cont.)
> gls.unstruct <- gls(hdrs ~ week*endo, correlation=corSymm(form = ~ 1 |
id), weights=varIdent(form=~1|week), data=reisby.long,
na.action="na.omit", method="ML")

> getVarCov(gls.unstruct, individual = "101")


## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 19.6330 10.3980 6.3449 8.086 6.7260 4.5004
## [2,] 10.3980 20.8660 10.2570 10.692 8.3337 5.0218
## [3,] 6.3449 10.2570 25.9020 22.360 23.9850 20.8840
## [4,] 8.0860 10.6920 22.3600 38.439 32.0320 29.9000
## [5,] 6.7260 8.3337 23.9850 32.032 46.8950 38.7660
## [6,] 4.5004 5.0218 20.8840 29.900 38.7660 60.1860
## Standard Deviations: 4.4309 4.5679 5.0894 6.1999 6.848 7.758

21
Autoregressive of order 1 (AR1) correlation
$$
\boldsymbol{\Sigma} =
\begin{pmatrix}
\sigma^2 & \sigma^2\rho & \sigma^2\rho^2 & \sigma^2\rho^3 & \sigma^2\rho^4 & \sigma^2\rho^5 \\
\sigma^2\rho & \sigma^2 & \sigma^2\rho & \sigma^2\rho^2 & \sigma^2\rho^3 & \sigma^2\rho^4 \\
\sigma^2\rho^2 & \sigma^2\rho & \sigma^2 & \sigma^2\rho & \sigma^2\rho^2 & \sigma^2\rho^3 \\
\sigma^2\rho^3 & \sigma^2\rho^2 & \sigma^2\rho & \sigma^2 & \sigma^2\rho & \sigma^2\rho^2 \\
\sigma^2\rho^4 & \sigma^2\rho^3 & \sigma^2\rho^2 & \sigma^2\rho & \sigma^2 & \sigma^2\rho \\
\sigma^2\rho^5 & \sigma^2\rho^4 & \sigma^2\rho^3 & \sigma^2\rho^2 & \sigma^2\rho & \sigma^2
\end{pmatrix}
= \sigma^2 \cdot
\begin{pmatrix}
1 & \rho & \rho^2 & \rho^3 & \rho^4 & \rho^5 \\
\rho & 1 & \rho & \rho^2 & \rho^3 & \rho^4 \\
\rho^2 & \rho & 1 & \rho & \rho^2 & \rho^3 \\
\rho^3 & \rho^2 & \rho & 1 & \rho & \rho^2 \\
\rho^4 & \rho^3 & \rho^2 & \rho & 1 & \rho \\
\rho^5 & \rho^4 & \rho^3 & \rho^2 & \rho & 1
\end{pmatrix}
$$

• Observations per subject assumed to be taken at equally-spaced
  intervals
• AR(1) assumes all observations 1 time unit apart have same
correlations
• Observations 2 units apart have corr ρ2, obs 3 units apart ρ3, etc.
• Outcome has same variance (𝜎 2 ) across all time points
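The pattern Σ[j,k] = σ²ρ^|j−k| is easy to construct in base R. The values below roughly mimic the homogeneous AR(1) fit shown on the next slide (σ² ≈ 35.05, ρ ≈ 0.659), purely for illustration:

```r
# Homogeneous AR(1) pattern: Sigma[j,k] = sigma^2 * rho^|j-k|
sigma_sq <- 35.05
rho <- 0.659
Sigma <- sigma_sq * rho ^ abs(outer(0:5, 0:5, "-"))
round(Sigma[1, ], 1)   # covariances decay geometrically with the time lag
```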

22
AR1 with homogeneous variances in R (cont.)
> gls.AR1 <- gls(hdrs ~ week*endo, correlation=corAR1(form = ~ 1 | id),
data=reisby.long, na.action="na.omit", method="ML")
> getVarCov(gls.AR1, type="marginal")
Marginal variance covariance matrix
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 35.0510 23.1050 15.231 10.040 6.6183 4.3627
[2,] 23.1050 35.0510 23.105 15.231 10.0400 6.6183
[3,] 15.2310 23.1050 35.051 23.105 15.2310 10.0400
[4,] 10.0400 15.2310 23.105 35.051 23.1050 15.2310
[5,] 6.6183 10.0400 15.231 23.105 35.0510 23.1050
[6,] 4.3627 6.6183 10.040 15.231 23.1050 35.0510
Standard Deviations: 5.9204 5.9204 5.9204 5.9204 5.9204 5.9204

23
AR1 with heterogeneous variances in R (cont.)
> gls.hAR1 <- gls(hdrs ~ week*endo, correlation=corAR1(form = ~ 1 | id),
weight = varIdent(form = ~ 1 | week), data=reisby.long,
na.action="na.omit", method="ML")

> getVarCov(gls.hAR1, individual = "101")


## Marginal variance covariance matrix
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 22.7480 15.8000 10.634 7.2293 4.9470 3.6599
## [2,] 15.8000 26.8450 18.067 12.2830 8.4055 6.2185
## [3,] 10.6340 18.0670 29.747 20.2240 13.8390 10.2380
## [4,] 7.2293 12.2830 20.224 33.6350 23.0160 17.0280
## [5,] 4.9470 8.4055 13.839 23.0160 38.5290 28.5050
## [6,] 3.6599 6.2185 10.238 17.0280 28.5050 51.5880
## Standard Deviations: 4.7695 5.1812 5.454 5.7996 6.2072 7.1825

Heterogeneous variances probably fit the data better


24
Break & practice
• Ex 1
• Next lecture at 11.45

25
“Covariance patterns” of linear mixed models
• A random intercept model implies a compound symmetry structure
for all data combined
• A linear mixed model with random intercept and random slope also
implies a certain correlation structure for the data, but this is by no
means a simple structure
o structure depends on the actual parameter estimates, but: usually the
variances increase for later time points and correlations decrease when
time points are further apart
o this is exactly what we observed for our data set, so this model might
fit the data quite well

26
Covariance patterns linear mixed models
Time as continuous, random intercept + random slope model:
> getVarCov(lme.ris,type="marginal")

Marginal variance covariance matrix


1 2 3 4 5 6
1 23.8600 10.239 8.838 7.4364 6.0349 4.6334
2 10.2390 23.134 11.590 12.2660 12.9420 13.6170
3 8.8380 11.590 26.562 17.0960 19.8480 22.6010
4 7.4364 12.266 17.096 34.1440 26.7550 31.5840
5 6.0349 12.942 19.848 26.7550 45.8800 40.5680
6 4.6334 13.617 22.601 31.5840 40.5680 61.7700

27
Random intercept / random slope model + AR(1)
• Final refinement to the random structure of the model: after
“correcting for” random intercept & slope (time) per patient, assume
residuals within the patients are correlated via an autocorrelation of
order 1
• For longitudinal data this correlation structure makes sense
o if you’re relatively high at time T you’re expected to be still high at T+1
o score at T+1 depends only on T, not on previous scores.
• AR(1) assumes that observations take place at equally-spaced
intervals, for each subject

28
Random intercept / random slope model + cAR(1)
• The AR(1) correlation structure can be modified in such a way that
time intervals between observations need not be equally long, nor
need they be the same for each subject:
o continuous AR(1), or cAR(1), correlation structure is defined via
  $\rho(s) = \rho^{|s|}$
o where 0 ≤ ρ < 1 and
o s is the number of time units by which 2 observations are separated
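A base-R sketch of the cAR(1) idea, corr(s) = ρ^|s| for any, possibly non-integer, separation s. The measurement times and ρ below are made up to show the unequal spacing:

```r
# cAR(1) correlation matrix for unequally spaced measurement times
rho <- 0.7
times <- c(0, 0.5, 1.3, 2, 4.1)           # hypothetical, unequally spaced
R <- rho ^ abs(outer(times, times, "-"))  # corr(s) = rho^|s|
round(R[1, ], 3)   # correlation with the first measurement decays with |s|
```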
• This model can be fitted in R (SPSS has AR(1) but not cAR(1) ):

lme.ris.CAR1 <- lme(fixed = HDRS ~ time*endo, random = ~ time|id,
                    data = reisby.long.noNA, method = "ML",
                    correlation = corCAR1(form = ~ time | id))

29
Reisby example, comparing correlation structures
• We have tried several logical models to fit the variance-covariance
matrix of the HDRS scores over time...
o CPM with heterogeneous AR(1) correlation structure
o CPM with unstructured correlation structure
o LME with random intercept + slope (for time)
o LME with random intercept + slope + cAR(1) correlation structure
• ...and several less logical models:
o CPM with identity correlation structure
o CPM with compound symmetry/LME with random intercept
o CPM with homogeneous AR(1) correlation structure

30
Reisby example, comparing correlation structures
• Four models with the same fixed structure (endo*week, 12 degrees
of freedom) and different random parts.
• Compare these using ML-based methods (LRT and / or AIC):

Model            # cov par   -2*logLike     AIC
Identity              1       2388.027   2414.027
Comp Symm             2       2277.381   2305.381
Unstructured         21       2183.227   2249.227
AR(1) homogen         2       2221.847   2249.847
AR(1) heterogen       7       2207.462   2245.462
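The AIC column can be reproduced from the table itself: AIC = −2·logLik + 2·(12 fixed-effect df + number of covariance parameters):

```r
# Reconstruct the AIC column from -2*logLike and the parameter counts
m2ll <- c(Identity = 2388.027, CompSymm = 2277.381, Unstructured = 2183.227,
          AR1hom = 2221.847, AR1het = 2207.462)
ncov <- c(1, 2, 21, 2, 7)           # covariance parameters per model
aic  <- m2ll + 2 * (12 + ncov)      # 12 fixed-effect df in every model
round(aic, 3)                       # matches the AIC column above
```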
31
Reisby example, comparing correlation structures
• From the comparison of the AICs:
o Taking dependence into account greatly improves the model fit
o Assuming equal variances and equal correlations is not a good option
for these data
o The parsimonious homog. AR(1) just as good as unstructured
o The AR(1) with heterogeneous variances is best

• We could also have used LRTs (for nested models only), with
corrected p-values:
o Homogeneous vs heterogeneous AR(1) (for instance):
• LRT = 2221.847 − 2207.462 = 14.385 with 7 − 2 = 5 df; p = 0.0133/2 = 0.0067
• heterogeneous is significantly better than homogeneous AR(1)
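The LRT arithmetic can be checked with `pchisq`, using the −2·logLike values from the comparison table:

```r
# LRT: homogeneous vs heterogeneous AR(1)
lrt <- 2221.847 - 2207.462               # difference in -2*log-likelihood
df  <- 7 - 2                             # 5 extra covariance parameters
p   <- pchisq(lrt, df, lower.tail = FALSE)
round(p, 4)                              # ~0.0133, which the slide then halves
```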

32
Which of the random structures fits best?
• In general: better to think in advance about which structure makes
sense for your data (“data-generating mechanism”) instead of fitting
and testing a lot of different structures
• If exploratory:
o models can be compared using AIC or LRT
o as with all modelling (also for fixed effects): keep in mind that the
“best” model is the best model for this one sample
o if you’ve used 10 steps to get to one model, your p-values and
confidence intervals will be too optimistic

33
Reisby example, comparing correlation structures
• We also compared LMEs with *linear* time trend with one another:
o fixed: time, endo, time*endo
o random: intercept vs. int+slope vs. int+slope+cAR(1)

> anova(lme.ris.CAR1, lme.ris, lme.ril)


Model df AIC BIC logLik Test L.Ratio p-value
lme.ris.CAR1 1 9 2224.838 2260.180 -1103.419
lme.ris 2 8 2230.929 2262.345 -1107.465 1 vs 2 8.09133 0.0044
lme.ril 3 6 2294.137 2317.699 -1141.069 2 vs 3 67.20798 <.0001

Model with rand int+slope+cAR(1) is better than model with rand
int+slope (LRT or AIC)
Model with rand int+slope is better than rand int only (LRT or AIC)

34
Reisby example, comparing correlation structures
• No LRT to compare LMEs with linear time trend to CPMs with
categorical time (differing fixed effects, non-nested models)
• Use AIC:

Model AIC
CPM Unstructured 2249.227
CPM AR(1) heterogen 2245.462
LME rand int + slope 2230.929
LME rand int + slope + cAR(1) 2224.838

Model with rand int+slope+cAR(1) is best according to AIC
(We used model w/o cAR(1) yesterday to reduce fixed effects & draw
conclusions - oops)

35
Break & practice
• Ex 1 & 2
• Lecture at 13:30

36
Checking assumptions of the model
• Model assumptions:
o linearity (if we use time – or other covariates – as linear)
• check with individual plots, spaghetti plots, residual plots
o normality of residuals
o normality of random intercepts (& slopes, if used)
• these three can be saved and checked using Q-Q plots, boxplots,
histograms
• but: generally not helpful
1. because deviations from normality probably not a big problem for
inference on fixed effects (if your interest is in inference on random
effects, there could be a problem)
2. model ‘inflicts’ normality on the random effects, so normality of the
estimated random effects may partly reflect model assumptions
o independence of residuals (once fixed and random effects are taken
into account)
• as in linear models: keep your fingers crossed!

37
Checking assumptions of the model in R
Diagnostic plots for (level-1) residuals: Q-Q plot of residuals (RI+RS model)
• Aside from three outliers, no departures from normality

[Figure: Q-Q plot, standardized residuals vs. quantiles of standard normal]

38
Checking assumptions of the model in R
Diagnostic plots for (level-1) residuals: residuals vs. fitted

• (We’re hoping for a graph with no patterns)
• This looks a bit problematic: a slight trend towards higher
  residuals with higher fitted values
• Problems with linearity and/or missing covariates?

[Figure: standardized residuals vs. fitted values]

39
Checking assumptions of the model in R
Diagnostic plots for (level-1) residuals: residuals vs. fitted by endo
[Figure: standardized residuals vs. fitted values, by endo group]

The problems we saw in the whole sample are present in both groups

40
Checking assumptions of the model in R
Boxplots residuals per subject
[Figure: boxplots of residuals per subject (id), residuals roughly −10 to 15]

We can use this plot to check for large outliers in residuals per person

41
Checking assumptions of the model in R
Observed vs. fitted per subject
[Figure: observed vs. fitted HDRS, one panel per subject id]

This plot can be used to check for individuals with poor agreement
between observed and fitted HDRS scores.
42
Checking assumptions of the model in R

• Quick plot of random intercepts and slopes for time
• (We’re not looking for patterns here, just for large outliers)
• No obvious outliers, though a few high-ish slopes, a couple
  high-ish intercepts

[Figure: per-subject coefficients, (Intercept) and time, plotted against id]

43
Statistical conclusions: imipramine example
• A model with fixed linear time effect, random intercept and random
slope for time, and cAR(1) correlation for the within subject residuals
seems to provide the “best” fit for these data
• The assumptions of normality for the level-1 and level-2 random
effects seem reasonable
• The assumption of constant variance of residuals (given the random
effects) might be violated

➢ Now we may present our results (with caution)

44
Three-level models
• So far: two levels
o children within schools, patients within hospitals
o measurements within individuals over time
• What about three levels?
o children within classrooms within schools
o longitudinal measurements within patients within hospitals

45
Analyzing three-level models
• Variance at 3 levels
o random effects (which??) at 2 levels
• Variables measured at 3 levels (children w/in classrooms w/in
schools)?
o main effects, interactions
o “cross-level” interactions (SES of school * SES of child, gender of
teacher * gender of child)
• Think carefully about design
o possible sources of variation
o effects at lower level that could possibly differ at higher level
• teacher-level (2) variables (gender, experience) could have different effects
at different schools (3)
• child-level (1) variables (gender, entrance exam score) could have different
effects in different classrooms (2) or at different schools (3)
• Think about research question: simplicity vs generalizability
46
Example three-level data
• Monday lab: multi-center hypertension trial: 27 centers, 193 patients,
  4 post-randomization DBP measurements
• Sources of variation:
o centers (level 3):
• may serve different populations, with (on average) higher or lower BP
o patients (level 2):
• patients vary greatly in their blood pressure levels
– age, gender, baseline BMI, treatment
• patients may vary (greatly?) in their trends over time
o measurements in time (level 1):
• BP varies considerably from moment to moment, day to day within
individuals
– stress level, tx adherence, BMI at the moment, ...

47
Example three-level data
• Design:
o randomized trial, so interest in tx & tx*time
o hospital-level variables: none provided
o patient-level variables: treatment (& baseline DBP?)
o measurement-level variables: time, DBP
• Fixed effects:
o (intercept,) time (linear or categorical?), tx, tx*time
o baseline DBP (why?)
• Random effects?
o differences in avg DBP among centers → random intercept per center
o difference in tx effect per center? → random tx per center (tricky!!)
o difference in time trend per center? → random time trend per center
o differences among patients → random intercept (& time trend) per
patient
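A hypothetical nlme sketch of such a three-level fit, with measurements (level 1) nested in patients (level 2) nested in centers (level 3). The data are simulated here, and the variable names (dbp, center, id) are assumptions, not the actual lab data set:

```r
library(nlme)
set.seed(1)
# Simulate 5 centers x 8 patients x 4 time points
d <- expand.grid(time = 0:3, pt = 1:8, center = factor(1:5))
d$id  <- factor(paste(d$center, d$pt))     # unique patient labels
u_ctr <- rnorm(5, sd = 2)                  # center-level random intercepts
u_pt  <- rnorm(nlevels(d$id), sd = 4)      # patient-level random intercepts
d$dbp <- 95 - 1.5 * d$time + u_ctr[as.integer(d$center)] +
         u_pt[as.integer(d$id)] + rnorm(nrow(d), sd = 3)
# Nested random intercepts: per center, and per patient within center
fit <- lme(dbp ~ time, random = ~ 1 | center/id, data = d)
VarCorr(fit)   # variance components at center, patient, and residual level
```

The `center/id` nesting is the key piece: it puts a random intercept at each of the two upper levels, mirroring the sources of variation listed above.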

48
Summary
• We can account for correlation of measurements over time
o explicitly: variance-covariance matrix of residuals (CPMs)
• primarily when everyone (theoretically) measured at same time points
o implicitly: random intercept, random slope for time (LMEs)
• can be used for measurements at same time points but also for differing
time points
o both explicitly & implicitly: LMEs with autocorrelated errors
• Model building better done in protocol
o Otherwise: use LRT or AIC to build random part of model, then to
simplify fixed part of model
• Some model assumptions (linearity, normality of res) can be checked
• 3+ levels also possible (complicated, but possible)

49
