0% found this document useful (0 votes)

11 views64 pages

Generalized Linear Models Generalized Estimating Equations Source: Book and Internet

The document discusses Generalized Estimating Equations (GEEs) and their application in analyzing correlated data from longitudinal and clustered studies. It highlights the limitations of traditional repeated measures ANOVA and presents GEEs as a solution for modeling such data while accounting for correlation structures. The document also provides examples, SAS code, and a comparison of different correlation structures used in GEEs.

Uploaded by

smmirazul25

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

11 views64 pages

Generalized Linear Models Generalized Estimating Equations Source: Book and Internet

Uploaded by

smmirazul25

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 64

Generalized Linear Models

Generalized Estimating Equations

Source: Book and internet

Dr. Md Jamal Uddin

Professor

Statistics, SUST, Sylhet

Correlated data
1. Repeated measures: same subjects, same measure, successive
times – expect successive measurements to be correlated

Treatment groups Measurement times

Subjects, i = B
1,…,n

Randomize

2
Correlated data
2. Clustered/multilevel studies
Level 3

Level 2

Level 1

E.g., Level 3: populations

Level 2: age - sex groups
Level 1: blood pressure measurements in sample of people in each
age - sex group
We expect correlations within populations and within age-sex groups due
to genetic, environmental and measurement effects

3
Example with time-dependent,
continuous predictor…
6 patients with depression are given a drug that increases levels of a “happy
chemical” in the brain. At baseline, all 6 patients have similar levels of this
happy chemical and scores >=14 on a depression scale. Researchers measure
depression score and brain-chemical levels at three subsequent time points: at 2
months, 3 months, and 6 months post-baseline.
Here are the data in broad form:

id time1 time2 time3 time4 chem1 chem2 chem3 chem4

1 20 18 15 20 1000 1100 1200 1300

2 22 24 18 22 1000 1000 1005 950
3 14 10 24 10 1000 1999 800 1700
4 38 34 32 34 1000 1100 1150 1100
5 25 29 25 29 1000 1000 1050 1010
6 30 28 26 14 1000 1100 1109 1500

4
Turn the data to long form…

data long4;
set new4;
time=0; score=time1; chem=chem1; output;
time=2; score=time2; chem=chem2; output;
time=3; score=time3; chem=chem3; output;
time=6; score=time4; chem=chem4; output;
run;

5
id time score chem
1 0 20 1000

Data in long1 2 18 1100

1 3 15 1200
form: 1 6 20 1300
2 0 22 1000
2 2 24 1000
2 3 18 1005
2 6 22 950
3 0 14 1000
3 2 10 1999
3 3 24 800
3 6 10 1700
4 0 38 1000
4 2 34 1100
4 3 32 1150
4 6 34 1100
5 0 25 1000
5 2 29 1000
5 3 25 1050
5 6 29 1010
6 0 30 1000
6 2 28 1100
6 3 26 1109
6 6 14 150
Graphically, let’s see what’s going on:
First, by subject.
All 6 subjects at once:
Mean chemical levels compared with mean
depression scores:
How do you analyze these
data?
Using repeated-measures ANOVA?

The only way to force a rANOVA here is…

data forcedanova;
set broad;
avgchem=(chem1+chem2+chem3+chem4)/4;
if avgchem<1100 then group="low";
Gives no
if avgchem>1100 then group="high";
run; significant
proc glm data=forcedanova; results!
class group;
model time1-time4= group/ nouni;
repeated time /summary;
run; quit; 15
Limitations of Repeated
ANOVA/MANOVA
• They assume categorical predictors.
• They do not handle time-dependent covariates
(predictors measured over time).
• They assume everyone is measured at the same time
(time is categorical) and at equally spaced time
intervals.
• Missing data must be imputed.
• They require restrictive assumptions about the
correlation structure.

16
If such conditions are violated

What are the solutions ?

Solutions!!!

• We need more complicated models!

• Models for correlated data
1. Linear mixed model (already we learned last
classes)
2. Generalized estimating equations
3. GLMM
Generalized Estimating Equations (GEEs)

These are used to model correlated data from

• Longitudinal/ repeated measures studies
• Clustered/ multilevel studies

19
But first…naïve analysis…
• The data in long form could be naively thrown into
an ordinary least squares (OLS) linear regression…
• I.e., look for a linear correlation between chemical
levels and depression scores ignoring the
correlation between subjects. (the cheating way to
get 4-times as much data!)
• Can also look for a linear correlation between
depression scores and time.
• In SAS:
proc reg data=long;
model score=chem time;
run;
20
Graphically…
Naïve linear regression here looks for significant slopes (ignoring
correlation between individuals):

Y= 24.90889 - 0.557778time. Y=42.44831-0.01685chem

N=24—as if we have 24 independent observations!

21
The model

The linear regression model:

22
Results…
The fitted model:

Parameter Standard
Variable DF Estimate Error t Value Pr > |t|

Intercept 1 42.46803 6.06410 7.00 <.0001

chem 1 -0.01704 0.00550 -3.10 0.0054
time 1 0.07466 0.64946 0.11 0.9096

1-unit increase in chemical is associated

with a .0174 decrease in depression score
(1.7 points per 100 units chemical)

Each month is associated only with a .07

increase in depression score, after
correcting for chemical changes. 23
Generalized Estimating
Equations (GEE)
• GEE takes into account the dependency of
observations by specifying a “working
correlation structure.”
• Let’s briefly look at the model (we’ll return
to it in detail later)…

24
The model…

Measures linear correlation between chemical levels and depression scores

across all 4 time periods. Vectors!
Measures linear correlation between time and depression scores.
CORR represents the correction for correlation between observations.

A significant beta 1 (chem effect) here would mean either that people who have high levels
of chemical also have low depression scores (between-subjects effect), or that people
whose chemical levels change correspondingly have changes in depression score
(within-subjects effect), or both.
25
SAS code (long form of data!!)

Generalized Linear models (using MLE)…

proc genmod data=long4;

class id;
model score=chem time;
repeated subject = id / type=exch corrw; run; quit;

The type of correlation structure…

Time is continuous (do not place on NOTE, for time-dependent predictors…

class statement)! --Interaction term with time (e.g. chem*time) is
NOT necessary to get a within-subjects effect.
Here we are modeling as a linear
--Would only be included if you thought there was
relationship with score. an acceleration or deceleration of the chem effect
with time. 26
Results…
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

Standard 95% Confidence

Parameter Estimate Error Limits Z Pr > |Z|

Intercept 38.2431 4.9704 28.5013 47.9848 7.69 <.0001

chem -0.0129 0.0026 -0.0180 -0.0079 -5.00 <.0001
time -0.0775 0.2829 -0.6320 0.4770 -0.27 0.7841

27
How does GEE work?
• First, a naive linear regression analysis is carried
out, assuming the observations within subjects
are independent.
• Then, residuals are calculated from the naive
model (observed-predicted) and a working
correlation matrix is estimated from these
residuals.
• Then the regression coefficients are refit,
correcting for the correlation. (Iterative process)
• The within-subject correlation structure is treated
as a nuisance variable (i.e. as a covariate)
28
OLS regression
variance-covariance matrix

t1 t2 t3 Correlation structure (pairwise

correlations between time
t1 points) is Independence.

Variance of scores is homogenous across

time (MSE in ordinary least squares
regression). 29
GEE variance-covariance matrix

t1 t2 t3 Correlation structure must be

specified.
t1

Variance of scores is homogenous across

time (residual variance).
30
Choice of the correlation
structure within GEE
In GEE, the correction for within subject correlations is
carried out by assuming a priori a correlation structure for
the repeated measurements (although GEE is fairly robust
against a wrong choice of correlation matrix—particularly
with large sample size)
Choices:
• Independent (naïve analysis)
• Exchangeable (compound symmetry, as in rANOVA)
• Autoregressive
• M-dependent
• Unstructured (no specification, as in rMANOVA)

We are looking for the simplest structure (uses up the fewest degrees of freedom)
that fits data well!
31
Correlation
For unit i

For repeated measures = correl between times l and m

For clustered data = correl between measures l and m
For all models considered here Vi is assumed to be same for
all units

32
Types of correlation
1. Independent: Vi is diagonal
2. Exchangeable: All measurements on the same
unit are equally correlated

Plausible for clustered data

Other terms: spherical and compound symmetry

33
Types of correlation
3. Correlation depends on time or distance between
measurements l and m

e.g. first order auto-regressive model has terms ρ,

ρ2, ρ3 and so on
Plausible for repeated measures where correlation is
known to decline over time
4. Unstructured correlation: no assumptions about the
correlations
Lots of parameters to estimate – may not converge

34
Types of correlation-Summary

35
Independence

t1 t2 t3
t1

36
Exchangeable

t1 t2 t3
t1

Also known as compound symmetry or

sphericity. Costs 1 df to estimate p.
37
Autoregressive

t1 t2 t3 t4
t1
t2
t3
t4

Only 1 parameter estimated.

Decreasing correlation for farther
time periods. 38
M-dependent

t1 t2 t3 t4
t1
t2
t3
t4

Here, 2-dependent. Estimate 2 parameters (adjacent time

periods have 1 correlation coefficient; time periods 2 units of
time away have a different correlation coefficient; others are
39
uncorrelated)
Unstructured

t1 t2 t3 t4
t1
t2
t3
t4

Estimate all correlations

separately (here 6)
40
How GEE handles missing data

Uses the “all available pairs” method, in

which all non-missing pairs of data are
used in the estimating the working
correlation parameters.
Because the long form of the data are
being used, you only lose the
observations that the subject is
missing, not all measurements.
41
Back to our example…

What does the empirical correlation matrix look like

for our data?
Pearson Correlation Coefficients, N = 6
Prob > |r| under H0: Rho=0

time1 time2 time3 time4

Independent? time1 1.00000 0.92569 0.69728 0.68635
Exchangeable? 0.0081 0.1236 0.1321

Autoregressive? time2 0.92569 1.00000 0.55971 0.77991

0.0081 0.2481 0.0673
M-dependent?
time3 0.69728 0.55971 1.00000 0.37870
Unstructured? 0.1236 0.2481 0.4591

time4 0.68635 0.77991 0.37870 1.00000

0.1321 0.0673 0.4591

42
Back to our example…

I previously chose an exchangeable correlation matrix…

proc genmod data=long4;

class id;
model score=chem time;
repeated subject = id / type=exch corrw; run; quit;

This asks to see the

working correlation
matrix.

43
Working Correlation Matrix

Working Correlation Matrix

Col1 Col2 Col3 Col4

Row1 1.0000 0.7276 0.7276 0.7276

Row2 0.7276 1.0000 0.7276 0.7276
Row3 0.7276 0.7276 1.0000 0.7276
Row4 0.7276 0.7276 0.7276 1.0000

Standard 95% Confidence

Parameter Estimate Error Limits Z Pr > |Z|

Intercept 38.2431 4.9704 28.5013 47.9848 7.69 <.0001

chem -0.0129 0.0026 -0.0180 -0.0079 -5.00 <.0001
time -0.0775 0.2829 -0.6320 0.4770 -0.27 0.7841
44
Compare to autoregressive…

proc genmod data=long4;

class id;
model score=chem time;
repeated subject = id / type=ar corrw;
run; quit;

45
Working Correlation Matrix
Working Correlation Matrix

Col1 Col2 Col3 Col4

Row1 1.0000 0.7831 0.6133 0.4803

Row2 0.7831 1.0000 0.7831 0.6133
Row3 0.6133 0.7831 1.0000 0.7831
Row4 0.4803 0.6133 0.7831 1.0000

Analysis Of GEE Parameter Estimates

Empirical Standard Error Estimates

Standard 95% Confidence

Parameter Estimate Error Limits Z Pr > |Z|

Intercept 36.5981 4.0421 28.6757 44.5206 9.05 <.0001

chem -0.0122 0.0015 -0.0152 -0.0092 -7.98 <.0001
time 0.1371 0.3691 -0.5864 0.8605 0.37 0.7104
46
Example two…recall…

From rANOVA:
Within subjects effects,
but no between subjects
effects.
Time is significant.
Group*time is not
significant.
Group is not significant.

This is an example with a

binary time-independent
predictor.
47
Empirical Correlation

Pearson Correlation Coefficients, N = 6

Prob > |r| under H0: Rho=0

time1 time2 time3 time4

Independent? time1 1.00000 -0.13176 -0.01435 -0.50848

0.8035 0.9785 0.3030
Exchangeable?
time2 -0.13176 1.00000 -0.02819 -0.17480
Autoregressive? 0.8035 0.9577 0.7405

M-dependent? time3 -0.01435 -0.02819 1.00000 0.69419

0.9785 0.9577 0.1260
Unstructured?
time4 -0.50848 -0.17480 0.69419 1.00000
0.3030 0.7405 0.1260

48
GEE analysis

proc genmod data=long;

class group id;
model score= group time group*time;
repeated subject = id / type=un corrw ;
run; quit;

NOTE, for time-independent predictors…

--You must include an interaction term with time to get a
within-subjects effect (development over time).

49
Working Correlation Matrix
Working Correlation Matrix
Col1 Col2 Col3 Col4
Group A is on average 8 points higher;
Row1 1.0000 -0.0701 0.1916 -0.1817 there’s an average 5 point drop per
Row2 -0.0701 1.0000 0.1778 -0.5931 time period for group B, and an
Row3 0.1916 0.1778 1.0000 0.5931 average 4.3 point drop more for group
A.
Row4 -0.1817 -0.5931 0.5931 1.0000

Analysis Of GEE Parameter Estimates

Empirical Standard Error Estimates

Standard 95% Confidence

Parameter Estimate Error Limits Z Pr > |Z|

Intercept 42.1433 6.2281 29.9365 54.3501 6.77 <.0001

group A 7.8957 6.6850 -5.2065 20.9980 1.18 0.2376
group B 0.0000 0.0000 0.0000 0.0000 . .
time -4.9184 2.0931 -9.0209 -0.8160 -2.35 0.0188
time*group A -4.3198 2.1693 -8.5716 -0.0680 -1.99 0.0464
GEE analysis

proc genmod data=long;

class group id;
model score= group time group*time;
repeated subject = id / type=exch corrw ;
run; quit;

51
Working Correlation Matrix
Working Correlation Matrix

Col1 Col2 Col3 Col4

Row1 1.0000 -0.0529 -0.0529 -0.0529

Row2 -0.0529 1.0000 -0.0529 -0.0529
Row3 -0.0529 -0.0529 1.0000 -0.0529
Row4 -0.0529 -0.0529 -0.0529 1.0000

Analysis Of GEE Parameter Estimates

Empirical Standard Error Estimates

Standard 95% Confidence

Parameter Estimate Error Limits Z Pr > |Z|

Intercept 40.8333 5.8516 29.3645 52.3022 6.98 <.0001

group A 7.1667 6.1974 -4.9800 19.3133 1.16 0.2475
group B 0.0000 0.0000 0.0000 0.0000 . .
time -5.1667 1.9461 -8.9810 -1.3523 -2.65 0.0079
time*group A -3.5000 2.2885 -7.9853 0.9853 -1.53 0.1262
Notation
• Repeated measurements: y , i = 1,… N, subjects; j = 1, … ni,
ij
times for subject i

• Clustered data: y , i = 1,… N, clusters; j = 1, … ni,

ij
measurements within cluster i
• Use “unit” for subject or cluster

53
Generalized linear model (GLM)

54
Generalized estimating equations
(GEE)

55
Generalized estimating equations

Di is the matrix of derivatives δμi/δβj

Vi is the ‘working’ covariance matrix of Yi
Ai=diag{var(Yik)},
Ri is the correlation matrix for Yi
φ is an overdispersion parameter

56
Overdispersion parameter
Estimated using the formula:

Where N is the total number of measurements and

p is the number of regression parameters
The square root of the overdispersion parameter
is called the scale parameter

57
Estimation – For GEEs

58
Iterative process for GEE’s
• Start with Ri=identity (ie independence) and φ=1:
estimate β
• Use estimates to calculated fitted values:

• And residuals:
• These are used to estimate Ai, Ri and φ
• Then the GEE’s are solved again to obtain improved
estimates of β

59
Steps for fitting GEE (first starts GEE model)

60
Sandwich variance or empirically corrected

61
Missing Data

For missing data, can estimate the working

correlation using the all available pairs method,
in which all non-missing pairs of data are used in
the estimators of the working correlation
parameters.

62
Choosing the Best Model
Standard Regression (GLM)

AIC = - 2log likelihood + 2(#parameters)

→ Values closer to zero indicate better fit and

greater parsimony.

63
Choosing the Best Model
GEE
QIC(V) – function of V, so can use to choose
best correlation structure.
QICu – measure that can be used to
determine the best subsets of covariates
for a particular model.
the best model is the one with the smallest
value!
64

GEE and Mixed Models For Longitudinal Data
No ratings yet
GEE and Mixed Models For Longitudinal Data
76 pages
GEE and Mixed Models
No ratings yet
GEE and Mixed Models
76 pages
Lecture 08 Nonlinearity
No ratings yet
Lecture 08 Nonlinearity
26 pages
Mathallcodes 1
No ratings yet
Mathallcodes 1
32 pages
Assign 4
No ratings yet
Assign 4
6 pages
Lecture 7
No ratings yet
Lecture 7
67 pages
Tutorials2016s1 Week7 Answers-3
No ratings yet
Tutorials2016s1 Week7 Answers-3
5 pages
Eutenico y Valorizacion Enero
No ratings yet
Eutenico y Valorizacion Enero
20 pages
Resultados Auten 31 Mayo
No ratings yet
Resultados Auten 31 Mayo
14 pages
Redsultados Conflic
No ratings yet
Redsultados Conflic
19 pages
Maths Project On Statistics
100% (1)
Maths Project On Statistics
7 pages
Stat 412 - M - 2022
No ratings yet
Stat 412 - M - 2022
21 pages
Data Pre Processing 1
No ratings yet
Data Pre Processing 1
35 pages
PSYCH278 Week 2 Lab
No ratings yet
PSYCH278 Week 2 Lab
8 pages
Soledad Enero
No ratings yet
Soledad Enero
12 pages
Analyzing Within vs. Between Subject Effects
No ratings yet
Analyzing Within vs. Between Subject Effects
34 pages
Advanced Statistics Problems (New) 1
No ratings yet
Advanced Statistics Problems (New) 1
5 pages
Resultados Reac 31 Mayo
No ratings yet
Resultados Reac 31 Mayo
14 pages
Resultados Inhibicion
No ratings yet
Resultados Inhibicion
18 pages
Decision Science: Sales (INR 000s)
No ratings yet
Decision Science: Sales (INR 000s)
17 pages
Resultados Inhibicion Mayo 20
No ratings yet
Resultados Inhibicion Mayo 20
15 pages
Grubbs' Outlier Test
No ratings yet
Grubbs' Outlier Test
2 pages
Inflasi ARIMA Model
No ratings yet
Inflasi ARIMA Model
7 pages
Salida Posta
No ratings yet
Salida Posta
12 pages
Resultados Soledad 31 Mayo
No ratings yet
Resultados Soledad 31 Mayo
12 pages
Nec Val Resultados Mayo 26
No ratings yet
Nec Val Resultados Mayo 26
17 pages
Data Set
No ratings yet
Data Set
7 pages
Sas 202
No ratings yet
Sas 202
5 pages
Resultados Factor Fusion Mayo
No ratings yet
Resultados Factor Fusion Mayo
15 pages
Chi-Square Test of Homogeneity: Example
No ratings yet
Chi-Square Test of Homogeneity: Example
55 pages
Da
No ratings yet
Da
2 pages
MAT 2001 Challenging Experiment 2
No ratings yet
MAT 2001 Challenging Experiment 2
12 pages
Statistics
No ratings yet
Statistics
8 pages
Friedmanns Test Statisitical Computation
No ratings yet
Friedmanns Test Statisitical Computation
1 page
General Item Analysis
No ratings yet
General Item Analysis
95 pages
Diabetes Dataset Analysis & Prep
No ratings yet
Diabetes Dataset Analysis & Prep
11 pages
Generalized Estimating Equations (Gees)
No ratings yet
Generalized Estimating Equations (Gees)
40 pages
TMP 005
No ratings yet
TMP 005
2 pages
GEE Annette
No ratings yet
GEE Annette
40 pages
It-3031 (DMDW) - Cs End April 2024
No ratings yet
It-3031 (DMDW) - Cs End April 2024
22 pages
5103A1
No ratings yet
5103A1
6 pages
Linear Regression for Beginners
No ratings yet
Linear Regression for Beginners
6 pages
Practical 10
No ratings yet
Practical 10
22 pages
Inbound 463564263383757458
No ratings yet
Inbound 463564263383757458
3 pages
Sabina Banda Biostatistics
No ratings yet
Sabina Banda Biostatistics
13 pages
Business Tools Decision Making Guide
No ratings yet
Business Tools Decision Making Guide
7 pages
II PU STATISTICSudupi
No ratings yet
II PU STATISTICSudupi
4 pages
Tool For Forecating (Chap 4)
No ratings yet
Tool For Forecating (Chap 4)
22 pages
Introduction To Statistics2
No ratings yet
Introduction To Statistics2
66 pages
Multiple Linear Regression and Checking For Collinearity Using SAS
0% (1)
Multiple Linear Regression and Checking For Collinearity Using SAS
18 pages
Data Presentation, Analysis and Interpretation
No ratings yet
Data Presentation, Analysis and Interpretation
57 pages
Item Analysis Latest
No ratings yet
Item Analysis Latest
20 pages
Elementary Statistics Exam Guide
No ratings yet
Elementary Statistics Exam Guide
25 pages
Dự báo và phát triển kinh doanh
No ratings yet
Dự báo và phát triển kinh doanh
43 pages
CHP 2 Mat161
No ratings yet
CHP 2 Mat161
12 pages
Resultados Comunicacion
No ratings yet
Resultados Comunicacion
12 pages
Resultados Comunicacion
No ratings yet
Resultados Comunicacion
12 pages
Introductory Statistical Concepts
100% (1)
Introductory Statistical Concepts
118 pages
Math T STPM Sem 3 2018
No ratings yet
Math T STPM Sem 3 2018
2 pages
Introduction To Descriptive Statistics
No ratings yet
Introduction To Descriptive Statistics
12 pages
Probability & Statistics Problem Set
No ratings yet
Probability & Statistics Problem Set
2 pages
1.3.6.6.8. Weibull Distribution
No ratings yet
1.3.6.6.8. Weibull Distribution
5 pages
Joreskog
No ratings yet
Joreskog
11 pages
Summary of T-Test (Rizqi Fadhilah)
No ratings yet
Summary of T-Test (Rizqi Fadhilah)
6 pages
15-The Effects of Various Variables On University Students' Writer's Block Levels
No ratings yet
15-The Effects of Various Variables On University Students' Writer's Block Levels
12 pages
SPReg
No ratings yet
SPReg
46 pages
Sampling
No ratings yet
Sampling
23 pages
H3 Test 3 Practice Questions
No ratings yet
H3 Test 3 Practice Questions
26 pages
Module 02.2 Moving Average Weighted Moving Average and Exponential Smoothing
No ratings yet
Module 02.2 Moving Average Weighted Moving Average and Exponential Smoothing
6 pages
Week 1 - Introduction To Statistics PDF
No ratings yet
Week 1 - Introduction To Statistics PDF
34 pages
The Mann Whitney or Wilcoxon Rank
No ratings yet
The Mann Whitney or Wilcoxon Rank
6 pages
Econometrics Lecture4 MultipleRegression
No ratings yet
Econometrics Lecture4 MultipleRegression
40 pages
Newbold Sbe8 Ch02
No ratings yet
Newbold Sbe8 Ch02
59 pages
Statistics II - K.C.S.E Topical Questions & Answers
92% (12)
Statistics II - K.C.S.E Topical Questions & Answers
2 pages
PM Project Logistic Regression LDA
No ratings yet
PM Project Logistic Regression LDA
22 pages
Test Bank Chap 3 PDF
100% (1)
Test Bank Chap 3 PDF
36 pages
Linear Regression Terms
No ratings yet
Linear Regression Terms
4 pages
Complete Answer Guide For Discovering Statistics Using IBM SPSS Statistics Field 4th Edition Test Bank
No ratings yet
Complete Answer Guide For Discovering Statistics Using IBM SPSS Statistics Field 4th Edition Test Bank
51 pages
Rainfall Prediction
No ratings yet
Rainfall Prediction
29 pages
ECON 322 ECONOMETRICS II - Kabarak University
No ratings yet
ECON 322 ECONOMETRICS II - Kabarak University
6 pages
Know The Preference of Consumer On Hero-Honda Two-Wheelers
No ratings yet
Know The Preference of Consumer On Hero-Honda Two-Wheelers
73 pages
Bcom 262
No ratings yet
Bcom 262
3 pages
Solutions
No ratings yet
Solutions
8 pages
Module 4
No ratings yet
Module 4
18 pages
CHAPTER 7 (SAS Session) 2023
No ratings yet
CHAPTER 7 (SAS Session) 2023
137 pages
Statistical Analysis in Microbiology StatNotes
0% (1)
Statistical Analysis in Microbiology StatNotes
173 pages
Nguyen Thanh Mai - RP A2 - F16039 - F16B
No ratings yet
Nguyen Thanh Mai - RP A2 - F16039 - F16B
56 pages

Generalized Linear Models Generalized Estimating Equations Source: Book and Internet

Uploaded by

Generalized Linear Models Generalized Estimating Equations Source: Book and Internet

Uploaded by

Generalized Linear Models

Generalized Estimating Equations

Source: Book and internet

Dr. Md Jamal Uddin

Statistics, SUST, Sylhet

Treatment groups Measurement times

E.g., Level 3: populations

id time1 time2 time3 time4 chem1 chem2 chem3 chem4

1 20 18 15 20 1000 1100 1200 1300

Data in long1 2 18 1100

The only way to force a rANOVA here is…

What are the solutions ?

• We need more complicated models!

These are used to model correlated data from

Y= 24.90889 - 0.557778*time. Y=42.44831-0.01685*chem

N=24—as if we have 24 independent observations!

The linear regression model:

Intercept 1 42.46803 6.06410 7.00 <.0001

1-unit increase in chemical is associated

Each month is associated only with a .07

Measures linear correlation between chemical levels and depression scores

Generalized Linear models (using MLE)…

proc genmod data=long4;

The type of correlation structure…

Time is continuous (do not place on NOTE, for time-dependent predictors…

Standard 95% Confidence

Intercept 38.2431 4.9704 28.5013 47.9848 7.69 <.0001

t1 t2 t3 Correlation structure (pairwise

Variance of scores is homogenous across

t1 t2 t3 Correlation structure must be

Variance of scores is homogenous across

For repeated measures = correl between times l and m

Plausible for clustered data

e.g. first order auto-regressive model has terms ρ,

Also known as compound symmetry or

Only 1 parameter estimated.

Here, 2-dependent. Estimate 2 parameters (adjacent time

Estimate all correlations

Uses the “all available pairs” method, in

What does the empirical correlation matrix look like

time1 time2 time3 time4

Autoregressive? time2 0.92569 1.00000 0.55971 0.77991

time4 0.68635 0.77991 0.37870 1.00000

I previously chose an exchangeable correlation matrix…

proc genmod data=long4;

This asks to see the

Working Correlation Matrix

Col1 Col2 Col3 Col4

Row1 1.0000 0.7276 0.7276 0.7276

Standard 95% Confidence

Intercept 38.2431 4.9704 28.5013 47.9848 7.69 <.0001

proc genmod data=long4;

Col1 Col2 Col3 Col4

Row1 1.0000 0.7831 0.6133 0.4803

Analysis Of GEE Parameter Estimates

Standard 95% Confidence

Intercept 36.5981 4.0421 28.6757 44.5206 9.05 <.0001

This is an example with a

Pearson Correlation Coefficients, N = 6

time1 time2 time3 time4

Independent? time1 1.00000 -0.13176 -0.01435 -0.50848

M-dependent? time3 -0.01435 -0.02819 1.00000 0.69419

proc genmod data=long;

NOTE, for time-independent predictors…

Analysis Of GEE Parameter Estimates

Standard 95% Confidence

Intercept 42.1433 6.2281 29.9365 54.3501 6.77 <.0001

proc genmod data=long;

Col1 Col2 Col3 Col4

Row1 1.0000 -0.0529 -0.0529 -0.0529

Analysis Of GEE Parameter Estimates

Standard 95% Confidence

Intercept 40.8333 5.8516 29.3645 52.3022 6.98 <.0001

• Clustered data: y , i = 1,… N, clusters; j = 1, … ni,

Di is the matrix of derivatives δμi/δβj

Where N is the total number of measurements and

Y= 24.90889 - 0.557778time. Y=42.44831-0.01685chem

AIC = - 2log likelihood + 2(#parameters)