A checklist for testing measurement invariance
Rens van de Schoot*1,2 Peter Lugtig1 Joop Hox1
1 Faculty of Social Science, Department of Methods and Statistics, Utrecht University, The Netherlands
2 Optentia Research Program, Faculty of Humanities, North-West University, Vanderbijlpark, South Africa
*Correspondence should be addressed to Rens van de Schoot: Department of Methodology and
Statistics, Utrecht University, P.O. Box 80.140, 3508TC, Utrecht, The Netherlands; Tel.: +31
302534468; Fax: +31 2535797; E-mail address: a.g.j.vandeschoot@uu.nl
Acknowledgement: The first author received a grant from the Netherlands Organization for
Scientific Research: NWO-VENI-451-11-008. With many thanks to Marie Stievenart, Stefanos
Mastrotheodoros, Leonard Vanbrabant and Esmee Verhulp for proofreading the manuscript.
Abstract
The analysis of measurement invariance of latent constructs is important in research across groups or across time. By establishing whether factor loadings, intercepts, and residual variances are equivalent in a factor model that measures a latent concept, we can ensure that comparisons made on the latent variable are valid across groups or time. Establishing measurement invariance involves running a set of increasingly constrained structural equation models and testing whether the differences between these models are significant. This paper provides a step-by-step guide to analyzing measurement invariance.
Keywords: confirmatory factor analysis, validity, measurement invariance
In the social and behavioral sciences, self-report questionnaires are often used to assess different aspects of human behavior. These questionnaires consist of items that are developed to assess an underlying phenomenon, with the goal of following individuals over time or comparing groups. To be valid for such comparisons, a questionnaire should measure identical constructs with the same structure across different groups. When this is the case, the questionnaire is called measurement invariant (MI). If MI can be demonstrated, then participants across all groups interpret the individual questions, as well as the underlying latent factor, in the same way. Once MI has been established, future studies can compare the occurrence, determinants, and consequences of the latent factor scores. When MI does not hold, groups or subjects over time respond differently to the items and, as a consequence, factor means cannot reasonably be compared.
Jöreskog (1971) was the first author to write about the equivalence of factor structures. The concept of MI was introduced by Byrne, Shavelson, and Muthén (1989), after which the testing of MI took off. Recent review articles have provided an overview of the multitude of substantive studies that tested MI (e.g., Vandenberg & Lance, 2000). However, a simple step-by-step checklist for testing MI is lacking, and providing one is exactly the goal of the current paper.
Software
MI can be tested using any structural equation modeling software program. LISREL (Jöreskog & Sörbom, 1996-2001) was long the best option. It can handle categorical data, but it requires syntax and knowledge of matrix algebra. AMOS (Arbuckle, 2007) is very user-friendly, but has limited capabilities for handling categorical data. Mplus (Muthén & Muthén, 2010) is currently the most flexible program, but requires knowledge of syntax. Lavaan (Rosseel, in press) and OpenMx (Boker et al., 2011) are both open-source R packages that are still being developed. We provide Mplus syntax for all the analyses described in the current paper on www.fss.uu.nl/mplus.
Model fit and model comparison
The most commonly used test of global model fit is the χ2 test (Cochran, 1952), but it is dependent on sample size: it rejects reasonable models when the sample is large and fails to reject poor models when the sample is small. Three other types of fit indices can be used to assess the fit of a model. For details and references, see Kline (2010).
First, there are comparative indices, which compare the fit of the model under consideration with the fit of a baseline model, for example the TLI and CFI. Fit is considered adequate if the CFI and TLI values are > .90, and better if they are > .95. The TLI attempts to correct for the complexity of the model, but is somewhat sensitive to small sample sizes. It can also become > 1.0, which can be interpreted as an indication of overfitting: making the model more complex than needed. If χ2 < df, the CFI is set to 1.0, which makes it a normed fit index.
Second, there are absolute indices, which examine closeness of fit, for example the RMSEA. The cut-off value is RMSEA < .08, and preferably < .05. The RMSEA is insensitive to sample size, but sensitive to model complexity.
Third, there are information-theoretic indices, for example the AIC and BIC. Both can be used to compare competing models and reflect a trade-off between model fit (i.e., the -2 log-likelihood value) and model complexity (i.e., a function of the number of parameters). A lower IC value indicates a better trade-off between fit and complexity. There is no rule of thumb: the values depend on the actual data set and model, so one simply chooses the model with the lowest IC value.
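To illustrate how these fit indices can be requested, a minimal sketch in R using the lavaan package mentioned above follows; the one-factor model and lavaan's built-in HolzingerSwineford1939 example data are illustrative stand-ins, not the Mplus syntax from our website.

library(lavaan)

# Illustrative one-factor CFA with three items from lavaan's built-in
# HolzingerSwineford1939 example data
dat   <- HolzingerSwineford1939
model <- 'ksi =~ x1 + x2 + x3'
fit   <- cfa(model, data = dat)

# Request the fit indices discussed above
fitMeasures(fit, c("chisq", "df", "pvalue", "cfi", "tli", "rmsea", "aic", "bic"))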
The factor model
Consider Figure 1, which shows a one-item questionnaire, the item being denoted by X. We assume there is an underlying mechanism causing the variance in X, denoted by the latent variable ksi. The regression equation is
X = b0 + b1 × ksi + b2 × error (1)
where b0 is the intercept, b1 is the regression coefficient (the factor loading in the standardized solution) between the latent variable and the item, and b2 is the regression coefficient between the residual variance (i.e., error) and the manifest item. For model identification purposes, this latter coefficient is fixed to equal 1. Note that if the means of ksi and the error are constrained to zero, the intercept of X is estimated. If, on the other hand, the intercept and the error mean are constrained to zero, then the mean of ksi is estimated.
As a result, there are two ways to parameterize the CFA model. This is illustrated in Figure 2, where three items, X1-X3, are believed to measure the same underlying latent variable ksi. First, if the latent factor mean is constrained to equal 0 and its variance to equal 1, then all factor loadings and all intercepts are estimated; see Figure 2A. Second, if one factor loading is constrained to equal 1 and the corresponding intercept to equal 0, then the other factor loadings, the other intercepts, and the factor mean and its variance are estimated; see Figure 2B. So, depending on what information you want to report, either the parameterization in Figure 2A or the parameterization in Figure 2B should be applied. Basically, the question boils down to:
• Do you want to compare the factor loadings across groups? Then, choose the
parameterization in Figure 2A;
• Do you want to compare the latent means across groups? Then, choose the
parameterization in Figure 2B. Note that the parameterization of Figure 2B is the default in AMOS, lavaan, and Mplus.
Sometimes you have to switch between parameterizations within one paper to answer both questions.
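Continuing the illustrative lavaan sketch from above, both parameterizations can be requested as follows (again using the example data, not a real questionnaire):

# Figure 2A: factor mean fixed to 0 and variance to 1;
# all three loadings and all three intercepts are estimated
fit_A <- cfa(model, data = dat, std.lv = TRUE, meanstructure = TRUE)

# Figure 2B: one loading fixed to 1 and the corresponding intercept to 0;
# the factor mean and factor variance are estimated instead
model_B <- '
  ksi =~ 1*x1 + x2 + x3   # marker loading fixed to 1
  x1  ~ 0*1               # corresponding intercept fixed to 0
  x2  ~ 1
  x3  ~ 1
  ksi ~ 1                 # factor mean, now a free parameter
'
fit_B <- cfa(model_B, data = dat)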
Testing for measurement invariance
In this section we discuss all the steps necessary to evaluate MI. See the supplementary
material on www.fss.uu.nl/mplus for Mplus syntax.
Before testing invariance, it is important that the data have been properly screened, for example to check whether one of the groups contains more (multivariate) outliers than the other group. MI studies rely on fitting a model to the observed covariance matrix (the data), so any bias in one of the groups due to outliers will affect factor loadings, intercepts, and error variances.
Start by specifying a confirmatory factor analysis (CFA) model that reflects how the construct is theoretically operationalized. This CFA model should be fitted for each group separately to test for configural invariance: whether the same CFA is valid in each group. Basically, this boils down to selecting each of the groups separately and running the CFA multiple times, or to running a multiple-group analysis without any equality constraints, as in the sketch below.
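In the running lavaan example, a configural multiple-group model could hypothetically look like this, using the school variable from the example data as the grouping variable:

# Configural invariance: the same CFA in each group, no equality constraints
fit_configural <- cfa(model, data = dat, group = "school")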
To test for MI, a set of increasingly constrained models needs to be estimated (a lavaan sketch of all four models follows this list).
1. Run a model where only the factor loadings are equal across groups but the intercepts
are allowed to differ between groups. This is called metric invariance and tests whether
respondents across groups attribute the same meaning to the latent construct under
study.
2. Run a model where only the intercepts are equal across groups, but the factor loadings are allowed to differ between groups. This tests whether the meaning of the levels of the underlying items (intercepts) is equal in both groups.
3. Run a model where the loadings and intercepts are constrained to be equal. This is called scalar invariance and implies that the meaning of the construct (the factor loadings) and the levels of the underlying items (intercepts) are equal in both groups. Consequently, groups can be compared on their scores on the latent variable.
4. Run a model where the residual variances are also constrained to be equal across groups. This is called full uniqueness MI and means that the explained variance for every item is the same across groups. Put more strongly, the latent construct is measured identically across groups. If the error variances are not equal, groups can still be compared on the latent variable, but it is then measured with different amounts of error between groups.
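In lavaan, these four models can hypothetically be obtained with the group.equal shortcuts, continuing the example above:

fit_m1 <- cfa(model, data = dat, group = "school",
              group.equal = "loadings")                    # Model 1: metric
fit_m2 <- cfa(model, data = dat, group = "school",
              group.equal = "intercepts")                  # Model 2
fit_m3 <- cfa(model, data = dat, group = "school",
              group.equal = c("loadings", "intercepts"))   # Model 3: scalar
fit_m4 <- cfa(model, data = dat, group = "school",
              group.equal = c("loadings", "intercepts",
                              "residuals"))                # Model 4: full uniqueness

# Chi-square difference tests for the nested sequence
anova(fit_configural, fit_m1, fit_m3, fit_m4)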
For straightforward interpretation of latent means and correlations across groups, both the
factor loadings and intercepts should be the same across groups (scalar invariance). On the
other hand, if the fit of Model 3 is significantly worse than that of Model 1 or 2, you can still try to establish partial MI (Steenkamp & Baumgartner, 1998).
The goal of tests of partial MI is to find out which of the loadings or intercepts differ
across groups. If only one of these is different across groups, we know that any differences on
the latent variable can either be caused by a difference in this loading/intercept, or by the true
latent variable group difference. As long as there are at least two loadings and intercepts that
are constrained equal across groups, we can make valid inferences about the differences
between latent factor means in the model (Byrne, Shavelson & Muthén, 1989). However, to be
able to compare the sum scores or comparable observed means, we must have full scalar
equivalence (Steinmetz, in press). If it can be established which specific item is problematic, questionnaires can be altered in the future (Lugtig, Boeije, & Lensvelt-Mulders, 2011).
To establish partial invariance, choose between Model 1 and Model 2. Study the size of the loadings and/or intercepts, and constrain all loadings and intercepts except for the one loading/intercept with the largest unstandardized difference, which is released. Subsequently, compare this new model with the old Model 1 or 2. If Δχ2 is now non-significant, partial invariance is established. If Δχ2 is still significant, release another item, and continue until the item that causes MI not to hold is identified.
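A hypothetical lavaan version of this step, releasing the intercept of x3 (as an example of the item with the largest unstandardized difference) while keeping the remaining scalar constraints, could look like this:

fit_partial <- cfa(model, data = dat, group = "school",
                   group.equal   = c("loadings", "intercepts"),
                   group.partial = "x3 ~ 1")   # intercept of x3 released

# Is the partially constrained model no longer significantly worse
# than the metric model (Model 1)?
anova(fit_m1, fit_partial)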
Reporting the results
After testing the invariance of the measurement model, the next step is to test the equality of factor means, and of correlations between the latent variables, across groups. Remember from the section on the parameterization of the CFA that if we are interested in comparing the latent means across groups, we need the parameterization in Figure 2B. Note that if you constrain the factor mean to zero in one of the groups, the estimated latent factor means in the other groups test for significant differences between the groups.
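In the lavaan sketch, for example, constraining the intercepts automatically fixes the factor mean to zero in the first group and frees it in the other group, so the reported estimate is the latent mean difference together with its significance test:

# Latent mean difference: ksi appears under 'Intercepts' for group 2
summary(fit_m3)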
Reporting on MI results can be cumbersome for applied researchers, as it involves testing many different models and reporting both the model results (the size of the factor loadings, intercepts, etc.) and the model fit. As a rule, first report on the model fit of every model, and use summary tables to give an overview of all models tested. Once it is established what level of MI holds, report the results only for the final model. Example text:
The CFA model with the unconstrained factor loadings and intercepts is shown in Figure 1. Two CFAs were conducted for group 1 (χ2 = ; p = ; CFI = ; TLI = ; RMSEA = ) and group 2 (χ2 = ; p = ; CFI = ; TLI = ; RMSEA = ) separately. Next, we tested for measurement invariance; see Table 1 for the fit indices. Model X has the lowest AIC/BIC value and therefore the best trade-off between model fit and model complexity. The other fit indices of Model X indicated a good fit. Compared to group 2, group 1 appeared to have a significantly lower mean factor score (∆M = ; p = ).
References
Arbuckle, J.L. (2007). Amos 16.0 User's Guide. Spring House, PA: Amos Development Corporation.
Boker, S., Neale, M., Maes, H., Wilde, M., Spiegel, M., et al. (2011). OpenMx: An Open
Source Extended Structural Equation Modeling Framework. Psychometrika, 76, 306-317.
Byrne, B.M., Shavelson, R.J., & Muthén, B.O. (1989). Testing for equivalence of factor
covariance and mean structures: The issue of partial measurement invariance.
Psychological Bulletin, 105, 456-466.
Cochran, W.G. (1952). The χ2 test of goodness of fit. Annals of Mathematical Statistics, 23,
315-345.
Jöreskog, K.G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36, 409-426.
Jöreskog, K.G., & Sörbom, D. (1996-2001). LISREL 8: User's Reference Guide (2nd ed.). Lincolnwood, IL: Scientific Software International.
Kline, R.B. (2010). Principles and Practice of Structural Equation Modeling (3rd ed.). New York, NY: Guilford Press.
Lugtig, P., Boeije, H., & Lensvelt-Mulders, G.J.L.M. (2011). Change, what change? Methodology.
Muthén, L.K., & Muthén, B.O. (2010). Mplus User's Guide (6th ed.). Los Angeles, CA: Muthén & Muthén.
Rosseel, Y. (in press). lavaan: an R package for structural equation modeling. Journal of
Statistical Software.
Steenkamp, J.M., & Baumgartner, H. (1998). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research, 25, 78-90.
Steinmetz, H. (in press). Analyzing observed composite differences across groups: Is partial
measurement invariance enough? Methodology.
Vandenberg, R.J., & Lance, C.E. (2000). A review and synthesis of the measurement
invariance literature: Suggestions, practices, and recommendations for organizational
research. Organizational Research Methods, 3, 4-70.
Table 1
Fit indices of the measurement invariance models
          χ²    df    p    CFI    TLI    RMSEA    BIC    AIC
Model 1
Model 2
Model 3
Model 4
Figure 1. CFA with one item.
Figure 2. Two ways of parameterizing the CFA model: (A) factor mean fixed to 0 and variance fixed to 1; (B) one factor loading fixed to 1 and the corresponding intercept fixed to 0.